Preface


The  purpose  of  this  thesis  is  to  demonstrate  the  feasibility  of  using  a  qualitative  analysis  method  to 
assess  and  evaluate  computer  security  vulnerabilities.  The  primary  motivation  for  this  research  is 
to  assist  the  United  States  Air  Force  (USAF)  in  assessing  and  eliminating  the  vulnerabilities 
identified  in  USAF  computer  systems.  Although  the  main  focus  of  this  thesis  is  to  evaluate 
computer  security  vulnerabilities,  the  methods  involved  have  application  in  other  areas  requiring 
evaluation  using  qualitative  methods. 

This thesis compares a quantitative approach to analysis with a qualitative
approach utilizing linguistic variables. Linguistic variables are represented using a calculus
described by Lotfi Zadeh (Zadeh65:338). Linguistic variables are terms such as Low, Medium,
and  High.  In  the  realm  of  vulnerability  analysis,  these  have  a  definite  semantic  meaning.  It  is 
proposed,  and  demonstrated  by  this  thesis,  that  the  use  of  qualitative  analysis  using  linguistic 
variables  to  describe  the  impact  of  computer  security  vulnerabilities  is  not  only  feasible,  but 
intrinsically  easier  to  understand  and  use  than  quantitative  methods. 

In  developing  the  necessary  computer  programs  and  writing  this  thesis,  I  have  had  a  great 
deal of help from others. I am most indebted to my faculty advisor, Major Gregg Gunsch. His
consistent support and guidance were always felt and much appreciated in the many times of need
and confusion. I wish to thank Drs. Henry Potoczny and Eugene Santos for serving on my thesis
committee.  I  also  wish  to  acknowledge  the  Air  Force  Cryptological  Support  Center  (AFCSC)  for 
their  generous  sponsorship  of  this  work. 

Finally,  I  wish  to  thank  my  wife,  Kim,  and  my  children,  Dakin  and  Mandi,  for  their  never- 
ending  support,  concern,  and  love  as  I  spent  the  many  days  and  nights  sequestered  in  my  office 
with  this  work. 

Abstract

This  thesis  demonstrates  the  feasibility  of  using  qualitative  analysis  methods  to  evaluate 
computer  security  vulnerabilities.  Although  many  risk  analysis  systems  exist,  few  provide  for  the 
adequate  analysis  of  identified  vulnerabilities.  While  the  main  focus  of  this  thesis  is  to  evaluate 
computer security vulnerabilities, the methods involved have application in other areas requiring
evaluation  using  qualitative  methods. 

It  is  proposed,  and  demonstrated  by  this  thesis,  that  the  use  of  qualitative  analysis  using 
linguistic  variables  to  describe  the  impact  of  computer  security  vulnerabilities  is  not  only  feasible, 
but  intrinsically  easier  to  understand  and  use  than  quantitative  methods. 


VULNERABILITY  ANALYSIS 


USING  A  FUZZY  LOGIC  BASED  METHOD 


I  Introduction 


1.1  General  Issue 

This  thesis  describes,  develops,  and  compares  automated  analysis  methods  providing 
support  capabilities  to  security  analysts  in  the  evaluation  of  computer  security  vulnerabilities. 

1.2  Background 

Computer  security  is  a  major  Air  Force  concern.  Air  Force  agencies  and  organizations  use 
computers  in  almost  every  aspect  of  their  operations.  To  maintain  even  minimal  operational 
capability,  organizations  must  increase  their  dependence  on  these  machines.  Associated  with  this 
increased dependence is an increase in associated costs, both tangible and intangible, resulting from
compromised  computer  resources.  Compromised  resources  occur  as  the  result  of  vulnerabilities 
being  exploited.  These  vulnerabilities  include,  but  are  not  limited  to,  unauthorized  access, 
sabotage,  physical  damage,  and  accidental  misuse. 

To  ensure  the  protection  of  computer  resources,  the  USAF  implemented  AFR  205-16, 
Security:  Computer  Security  Policy.  The  purpose  of  this  regulation  is  to  ".  .  .  protect  the 
confidentiality,  integrity,  and  availability  of  information  processed  on  all  Air  Force  computer 
systems"  (DAF89:1).  Air  Force  Systems  Security  Instruction  (AFSSI)  5100,  The  Air  Force 
COMPUSEC  Program  and  AFSSI  5102,  Computer  Security  for  Operational  Systems  later 
supplemented AFR 205-16. AFSSI 5102 requires each facility processing information for the
USAF  to  perform  a  security  risk  analysis  on  each  installed  computer.  This  requirement  includes 
both  government-  and  contractor-owned  facilities  (DAF93b:2).  This  requirement  to  perform  a 
security  risk  analysis  applies  not  only  to  Air  Force  systems,  but  as  directed  by  the  Office  of 
Management  and  Budget  of  the  United  States  Circular  Number  A-71,  to  ".  .  .  every  federal 
department or agency operating one or more computer installations . . ." (Carroll84:2).

Risk  analysis  identifies  threats  and  vulnerabilities  associated  with  a  given  computer  system 
and  determines  if  the  safeguards  in  effect  adequately  protect  the  system  from  compromise.  After 
eliminating  the  risk,  or  reducing  the  risk  to  acceptable  levels,  the  computer  security  officer  (CSO) 
authorizes  the  computer  system  to  be  operated  at  a  maximum  sensitivity  level  and  for  a  certain 
mission.  Any  change  to  the  computer  system,  whether  hardware,  software,  or  mission,  mandates 
performing  another  risk  analysis. 

When  the  Air  Force  established  AFR  205-16,  most  of  the  computers  in  the  Air  Force  were 
large  single-site  mainframes.  Although  very  complicated  and  large  systems,  the  configuration  and 
intended  usage  of  these  mainframes  did  not  change  often.  CSOs  could  manage  the  required 
number  of  risk  analyses  manually.  With  the  advent  of  personal  computers  and  desktop 
workstations,  the  required  number  of  analyses  became  too  difficult  to  manage.  AFCSC,  located  at 
Kelly  AFB,  TX,  developed  the  Automated  Risk  Evaluation  System  (ARES)  to  address  this 
management  problem. 

ARES  is  a  computer  program  that  assists  computer  security  personnel  in  performing  a  risk 
analysis.  The  computer  program  guides  the  user  through  a  myriad  of  questions  pertaining  to  all 
aspects  of  computer  security.  After  the  user  answers  all  of  the  pertinent  questions,  ARES 
generates  a  series  of  reports  identifying  the  possible  vulnerabilities  associated  with  a  particular 
computer  system. 

The listing of identified vulnerabilities generated by ARES does not provide any indication
as  to  the  importance  of  each  vulnerability.  The  CSO  must  still  evaluate  the  importance  of  each 
vulnerability.  The  manual  process  of  evaluating  computer  security  vulnerabilities  is  very  labor 
intensive.  To  help  ease  this  workload,  this  thesis  presents  two  automated  methods  possibly  useful 
in  the  evaluation  of  computer  security  vulnerabilities. 

At this point, I need to stress a few items. First, regardless of the method used, the
evaluation  of  vulnerabilities  by  a  CSO  is  subjective.  Security  analysis  is  not  an  absolute  and  the 
degree  of  importance  assigned  to  a  vulnerability  may  differ  from  CSO  to  CSO. 

Second,  the  evaluation  of  vulnerabilities  is  context  sensitive  with  regard  to  location, 
hardware,  software,  and  mission  requirements.  This  implies  that  one  site  might  identify  a 
vulnerability  as  trivial  while  another  site  might  identify  the  same  vulnerability  as  critical.  Lack  of 
backup  power  is  an  example  of  a  context  sensitive  vulnerability.  Not  having  backup  power  for  an 
air traffic control system would probably be critical while not having backup power for an electronic
message system might be trivial. Even in this example, the use of the system is context sensitive.
If  the  air  traffic  control  system  is  for  a  small  airport  and  the  airport  only  uses  the  system  as  a 
secondary  method  of  air  traffic  control,  the  system  may  not  be  critical  to  flight  operations,  hence, 
the  system  may  not  require  backup  power.  Likewise,  the  vulnerability  of  not  having  backup  power 
is  critical  if  world  leaders  use  the  electronic  message  system  for  communication.  This  implies  that 
the  automated  method  must  allow  for  each  CSO  to  tailor  the  importance  of  vulnerabilities  to  meet 
site  specific  requirements. 

Last, and most important, is that there is no proven and demonstrable "industry standard"
method  for  analyzing  computer  security  vulnerabilities.  As  such,  the  methods  discussed  and 
developed  in  this  thesis  cannot  be  proven  to  be  any  better  or  worse  than  any  other  method.  This 
thesis  does  demonstrate  that  the  methods  presented  have  merit  with  regard  to  their  ability  to 
analyze computer security vulnerabilities. Their applicability to a specific computer site is subject
to  that  site's  requirements  and  existing  analysis  techniques. 

1.3  What  are  Risk  and  Vulnerability  Analysis? 

Before continuing, it is essential to define the differences between risk and vulnerability
analysis. Risk is defined as "the possibility of loss" (Carroll84:xv) while vulnerability is defined as
"a weakness or lack of controls that would allow or facilitate a threat actuation against a specific
asset or target" (Podell86:88). How these terms differ is best demonstrated by an example. The
computer  system  is  identified  as  not  having  a  password  capability.  This  is  a  vulnerability.  The 
loss  or  compromise  of  data  because  the  system  doesn't  have  a  password  capability  is  a  risk.  In 
other  words,  risks  are  caused  by  a  vulnerability  being  exploited. 

A  key  component  of  any  risk  analysis  is  vulnerability  identification  and  analysis 
(Carroll84:137). Vulnerability analysis is the evaluation of identified vulnerabilities to determine
the  importance  or  impact  of  each  vulnerability  with  respect  to  all  other  identified  vulnerabilities. 
The  importance  or  impact  is  usually  with  regard  to  confidentiality,  availability,  and  integrity  of  the 
system.  In  vulnerability  analysis,  countermeasures  are  not  considered  and  no  evaluation  is 
performed with regard to expected loss (Carroll84:90). In other words, the end product of a
vulnerability  analysis  is  the  categorization  and  clustering  of  vulnerabilities  by  importance  or 
impact. 

Risk  analysis  is  the  "analysis  of  system  assets  and  vulnerabilities  to  determine  the 
system's  exposure  or  expected  loss"  (Podell86:84).  In  order  to  perform  an  effective  risk  analysis,  a 
complete  vulnerability  analysis  must  be  performed.  The  categorization  of  the  vulnerabilities 
performed  by  the  vulnerability  analysis  and  the  cost  information  associated  with  each  vulnerability 
being exploited are combined and analyzed in a risk analysis. The final product of the risk analysis
indicates  which  clustering  of  vulnerabilities  and  cost  provide  the  greatest  exposure  or  expected 
loss.  Risk  can  only  be  effectively  analyzed  if  all  vulnerabilities  are  identified  and  evaluated. 

1.4  Motivation 

The  primary  motivation  for  this  research  is  a  need  to  automate  and  standardize  computer 
vulnerability  analysis.  A  secondary  motivation  arose  as  a  result  of  discussions  with  the  managers 
of ARES. As with any on-going software project, ARES is under constant modification and
revision.  One  of  the  revisions  planned  for  ARES  is  the  incorporation  of  a  risk  analysis 
methodology.  As  proposed  by  the  developers  and  maintainers  of  ARES,  this  risk  analysis 
methodology would use a quantitative analysis approach (Trident93:8). To ensure the best product
is  fielded,  I  proposed  to  the  managers  of  ARES  that  a  qualitative  analysis  approach  might  be  more 
intuitive  for  the  end-user  to  understand  and  utilize,  while  still  maintaining  the  effectiveness  and 
efficiency  of  a  quantitative  method. 

This  research  is  the  first  phase  of  a  multi-phase  research  project.  This  thesis  specifically 
addresses  the  feasibility  of  using  a  qualitative  versus  quantitative  analysis  method  to  evaluate 
identified  vulnerabilities.  Future  phases  outlined  in  chapter  6  will  address  other  risk  analysis 
capabilities  and  implementation  details. 

1.5  Hypothesis 

The  hypothesis  of  this  thesis  is  stated  in  two  parts:  that  a  qualitative  analysis  approach  to 
vulnerability analysis is as effective and efficient as a quantitative approach and that the qualitative
approach provides the security analyst with intuitive information not readily available in
quantitative  approaches.  Effectiveness  is  measured  as  the  ability  to  provide  reasonable 
categorizations of identified vulnerabilities based on importance or impact. Efficiency is measured
with  regard  to  processing  time  and  scalability  of  the  method. 

1.6  Research  Objectives 

This  thesis  has  two  objectives.  First,  to  identify  and  develop  automated  methods  that 
provide  a  consistent  evaluation  of  vulnerabilities.  Second,  to  evaluate  each  method  for 
effectiveness  and  efficiency. 

1.7  Scope 

The  scope  of  this  thesis  is  limited  to  identifying,  developing,  and  evaluating  two  methods  capable 
of  evaluating  computer  security  vulnerabilities.  The  vulnerabilities  used  as  test  data  in  this  thesis 
are  a  subset  of  the  possible  vulnerabilities  generated  by  ARES. 

1.8  Document  Structure 

This  thesis  contains  six  chapters  and  two  appendices.  The  motivation  and  background  for 
this  research  is  provided  in  this,  the  first  chapter.  Also  provided  in  this  chapter  are  the  research 
objectives.  The  second  chapter  provides  an  overview  and  discussion  of  vulnerability  analysis 
methods  and  various  automated  analysis  methods.  The  third  chapter  discusses  the  methodologies 
used  in  this  thesis  and  the  justification  for  using  those  methods.  In  the  fourth  chapter,  the  details  of 
how the methodologies were implemented are provided. The fifth chapter presents the results of
this research, and the conclusions and recommendations are provided in the sixth chapter.

The first appendix contains a high level overview of the Lisp program code implemented
for each of the methods. The second appendix contains sample output generated from two of the
implemented methods. Each of the sample runs was generated using the same sample of 50
vulnerabilities. The complete source code and sample data files can be obtained by contacting:

Major Gregg Gunsch (AI Lab)
AFIT/ENG
2950 P Street
Wright-Patterson AFB, OH 45433-7765
ggunsch@afit.af.mil

1.9  Summary 

There  is  a  need  to  automate  the  analysis  of  identified  computer  security  vulnerabilities. 
The  Air  Force  developed  ARES,  a  system  capable  of  identifying  vulnerabilities,  but  not  providing 
any  analysis  capabilities.  Two  general  categories  of  analysis  methods,  quantitative  and  qualitative, 
can  be  used  to  perform  vulnerability  analysis.  In  the  field  of  vulnerability  analysis,  there  is 
currently  no  accepted  standard  method.  This  thesis  hypothesizes  that  the  qualitative  methods  are 
comparably  effective  and  efficient  in  performing  this  analysis  and  provide  intuitive  information  to 
the  analyst  as  compared  to  quantitative  methods.  To  support  this  hypothesis,  both  a  numeric  and 
non-numeric analysis method were developed and implemented.




II  Historical Development


2.1  Introduction 

Beginning with the first ENIAC computer used by the War Department during World War
II, there has been a need to ensure the security of computing resources. Since the first computers
were very large in size and all components located in one facility, physical security of the system
was adequate to protect the resources. With the advent of multi-user systems came the need to
incorporate protection mechanisms into the computer's operating system. These protection
schemes continued to evolve to encompass aspects of physical, data, user, and environmental
security. 

As  computers  and  their  associated  operating  environments  became  increasingly  complex,  it 
became  proportionally  difficult  to  protect  these  resources.  To  assist  the  security  managers  in 
analyzing  the  risk  to  their  computing  resources,  several  analysis  methods  were  devised.  These 
methods  involve  using  some  form  of  reasoning  with  uncertainty  to  assist  in  the  risk  analysis. 

Reasoning  with  uncertainty  is  an  essential  part  of  performing  risk  analysis.  If  the  CSO 
could eliminate all of the uncertainty in computer security, he could simply look up in a table what
safeguard to put in place for each vulnerability. However, all of the uncertainty cannot be
eliminated. The introduction of new computer technology introduces new risks. With these new
risks come new uncertainties and changes in existing uncertainties. Also, as the dependence on
computer technology increases, so do the associated costs of these new risks. Because of these
ever changing conditions, the analysis method used by the CSO must have the ability to reason
with  uncertainty. 

2.2  Overview of Problem


There  are  two  basic  problems  with  evaluating  computer  security  vulnerabilities.  The  first 
is  determining  what  aspects  of  computer  security  vulnerabilities  to  evaluate.  This  involves 
establishing  the  functional  areas  that  each  vulnerability  affects.  For  instance,  user  IDs  affect 
access control and operating system capabilities. The second problem is then determining how to
evaluate the vulnerabilities affecting each functional area in order to provide an overall
vulnerability  rating  for  each  functional  area. 

For  the  purpose  of  this  thesis,  I  defined  seven  functional  areas  that  each  vulnerability  can 
affect.  These  are  audit  capabilities,  recovery  capabilities,  access  control,  magnetic  media 
control,  operating  system  capabilities,  configuration  control,  and  documentation.  Each 
vulnerability may affect more than one of these functional areas. These seven areas could be
further divided into more detailed areas and probably should be in a fielded implementation. For
instance, access control could be further divided into hardware protection (e.g., call-back
modems) and software protection (e.g., password) control schemes. For the purposes of this thesis,
however, I will only demonstrate the capability of the methods to handle these seven functional
areas.  Each  method  tested  will  use  the  same  seven  functional  areas  and  vulnerabilities  to  generate 
a  vulnerability  rating  for  each  functional  area. 
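
The many-to-many relationship between vulnerabilities and functional areas described above can be pictured with a small data structure. The following is only an illustrative sketch in Python, not the Lisp code developed for this thesis. The names FUNCTIONAL_AREAS, AFFECTS, and group_by_area are assumptions made for the example. The first entry follows the user ID example in the text, and the second entry is a hypothetical assignment.

```python
from collections import defaultdict

# The seven functional areas named in the text.
FUNCTIONAL_AREAS = (
    "audit capabilities",
    "recovery capabilities",
    "access control",
    "magnetic media control",
    "operating system capabilities",
    "configuration control",
    "documentation",
)

# Vulnerability -> functional areas it affects.
# "no user IDs" follows the user ID example in the text; the
# "no backup power" assignment is hypothetical, made up for this sketch.
AFFECTS = {
    "no user IDs": {"access control", "operating system capabilities"},
    "no backup power": {"recovery capabilities"},
}


def group_by_area(affects):
    """Invert the mapping: functional area -> sorted list of vulnerabilities."""
    by_area = defaultdict(list)
    for vulnerability, areas in affects.items():
        for area in areas:
            if area not in FUNCTIONAL_AREAS:
                raise ValueError(f"unknown functional area: {area}")
            by_area[area].append(vulnerability)
    # Report every area, including those with no identified vulnerabilities.
    return {area: sorted(by_area[area]) for area in FUNCTIONAL_AREAS}


grouped = group_by_area(AFFECTS)
```

Grouping by area first means that any rating method only needs to look at the list of vulnerabilities attached to one area at a time.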

A  large  problem  faced  in  this  thesis  was  deciding  how  to  evaluate  the  functional  areas  and 
the vulnerabilities affecting each area. My problem arose from deciding which method best
reasons with uncertainty. Giarratano and Riley define uncertainty as "... the lack of adequate
information to make a decision" (Giarratano89:185). Uncertainty comes from several sources.
First, uncertainty occurs when some or all of the data is unobtainable or missing. Unobtainable
implies the data is not available from any source. Second, uncertainty arises when the expert
provides  inexact  or  inconsistent  information.  Third,  uncertainty  comes  from  data  that  is  available, 
but  contains  errors  (Giarratano89:186-190,221).  Last,  I  would  add  that  uncertainty  includes 
information  difficult  to  quantify  such  as  loss  of  life  or  delay  in  mission. 

2.3  Reasoning  with  Uncertainty 

2.3.1  Introduction  of  AI  Methods.  As  stated  earlier,  a  problem  faced  in  this  thesis 
was  deciding  which  method  best  performs  reasoning  with  uncertainty.  The  sciences  of  Artificial 
Intelligence  (AI)  and  probability  theory  have  much  to  offer  towards  helping  solve  this  problem. 
Discussed below are several approaches to reasoning with uncertainty. These approaches are by
no  means  the  only  methods  available,  but  they  are  representative  of  the  varied  'schools  of  thought' 
concerning  how  to  reason  with  uncertain  information.  Other  methods  include  Intuitionistic  Logic 
(Martin-Lof82),  Multiple  Valued  Logic  (Lukasiewicz67),  Variable  Value  Logic  (Michalski75), 
Variable Precision Logic (Michalski86), Default Logic (Reiter80, Yager87), Temporal Logic
(McDermott82),  and  Decision  Trees  (Quinlan82)  to  name  a  few  (Dontas87:2). 

2.3.2  Bayes’  Theorem.  The  most  well  known  method  to  deal  with  uncertainties  is 
Bayes'  Theorem  (Rich91:231,  Giarratano89:204,  Bacchus90:67).  This  theorem,  shown  in 
Equation (1), forms the basis for conditional probability theory and allows for calculating the
inverse or a posteriori probability. If the probability of event A occurring given event B, P(A|B),
is known and the probability of event B occurring unconditionally, P(B), is known, Bayes'
Theorem allows the calculation of the probability of event B given event A, P(B|A). P(A|B) and
P(B) are a priori and must be known prior to solving for P(B|A), the a posteriori probability.

Bayes' Theorem has one advantage over all of the other methods to be discussed. Bayes'
Theorem produces a precise and mathematically provable solution if we know these a priori
probabilities. Bayes' Theorem has several large drawbacks that make it unfeasible to use. It
should be noted that Bayes' Theorem is relatively easy to implement and is theoretically easy to
understand, but for most real world problems is very difficult to use. First, in order to calculate the
a posteriori probability, two a priori probabilities, P(A|B_i) and P(B_i), have to be known. For
many real world problems, knowing or obtaining these probabilities is difficult, if not impossible
(Rich91:233, Dillard91:1). Second, if the problem deals with dependent events, and thus joint
probabilities, Bayes' Theorem grows exponentially and becomes computationally intractable
(Rich91:233).


\[ P(B_i \mid A) = \frac{P(A \mid B_i)\, P(B_i)}{\sum_{n=1}^{k} P(A \mid B_n)\, P(B_n)} \tag{1} \]

where:  P(B_i|A) = the a posteriori probability that event B_i occurs given event A occurs
        P(A|B_i) = the a priori probability we will observe event A given event B_i occurs
        P(B_i) = the a priori probability event B_i will occur independent of any other event
        k = the number of possible events

(Rich91:232)


As  elegant  and  simple  as  Bayes'  Theorem  is  to  understand  and  implement,  its  drawbacks 
prevent it from being applicable to the vulnerability analysis problem. Its inapplicability arises
from the varying number of vulnerabilities to be evaluated and the unavailability of a complete set
of a priori probabilities. It could be argued that the missing probabilities could simply be
generated by the expert based on experience. Even with a complete set of a priori probabilities,
the implementation still could not overcome the exponential performance characteristics of the
theorem. 
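
To make the calculation concrete, the following Python fragment is a minimal sketch of Bayes' Theorem for k mutually exclusive and exhaustive events. The function name posterior and the numeric inputs are assumptions chosen purely for illustration. They are not data from this research.

```python
def posterior(likelihoods, priors):
    """Return the a posteriori probabilities P(B_i | A) for every i.

    likelihoods[i] is the a priori value P(A | B_i).
    priors[i] is the a priori value P(B_i).
    The events B_i are assumed mutually exclusive and exhaustive.
    """
    if len(likelihoods) != len(priors):
        raise ValueError("need one likelihood per prior")
    # Numerator of the theorem for each event B_i.
    joint = [l * p for l, p in zip(likelihoods, priors)]
    # Denominator: the sum over all k events, i.e. the total probability of A.
    evidence = sum(joint)
    if evidence == 0:
        raise ValueError("event A has zero probability under these inputs")
    return [j / evidence for j in joint]


# Hypothetical values, chosen only for illustration.
example = posterior(likelihoods=[0.9, 0.2], priors=[0.1, 0.9])
```

The sketch cannot run unless every P(A|B_i) and every P(B_i) is supplied, which is the practical obstacle discussed above.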

2.3.3  Bayesian Networks. A variation on the pure Bayes' Theorem is Bayesian
Networks. This alternative, developed by Judea Pearl (Pearl88), uses a network structure to model
the problem. Pearl hypothesized that instead of representing the problem as one large joint
probability distribution as required by Bayes' Theorem, the problem could be broken up into a
network of individual nodes. Each node is probabilistically independent of all the other nodes and
therefore would not suffer from the exponential growth of a standard Bayes' Theorem approach.
Those events that are dependent and therefore must be 'processed' together are represented together
within a single node (Rich91:239, Oliver90:387).

Although  minimizing  the  combinatorial  effects  of  a  pure  Bayes'  Theorem  solution, 
Bayesian Networks still require the same a priori probabilities. Because of a lack of available
data on the probabilities that a particular vulnerability will be exploited, this method is also not
directly applicable to the problem of vulnerability analysis. Another reason this method is not
applicable deals with the generation of the network. This network, which shows all of the
dependent relations and probabilities, would have to be encoded with all of the possible
vulnerabilities. As mentioned earlier, new technology is creating new vulnerabilities. If the
Bayesian Network approach was used, it would require rebuilding at least a portion of the network
structure each time a new vulnerability was identified or removed (Oliver90:387). Depending on
the structure of the network, the changes may only have to be made to a single node. Changes in
vulnerabilities cannot simply be spliced into or out of the existing network. There are several
update methods that can be applied to making these changes, but if there are any intersecting nodes
in the network, all of the update methods suffer from combinatorial problems (Oliver90:388). The
existing network is built based on the dependencies of the known vulnerabilities. If changes are
made to the network, they may have a propagation effect upon the existing dependencies with the
net  result  being  a  complete  rebuilding  of  the  network. 

2.3.4  Dempster-Shafer. Another methodology using probabilities is Dempster-Shafer
theory (Dempster67, Shafer76, Giarratano89:275). Unlike Bayes' Theorem, where each event is
treated individually, Dempster-Shafer works with sets of events. These sets of events are mutually
exclusive and for each set of events a probability density function, m, is defined. In actuality, the
probability density function is defined not only for the set of events, but for all subsets as well. If
there are n events, there are 2^n subsets of events. According to Rich and Knight, many of these
subsets of events are insignificant to the problem and their probability density functions return
values of zero (Rich91:243).

The  significance  of  this  is  that  all  of  the  combinations  of  vulnerabilities  can  be  represented 
as  sets  of  vulnerabilities.  Some,  if  not  most,  of  the  subsets  of  vulnerability  combinations  would  be 
impossible  in  the  real  world  so  the  probability  density  function  would  return  a  value  of  zero. 

Dempster-Shafer theory introduces the concepts of belief and plausibility. Belief is
defined as the minimum support provided by the evidence while plausibility is the maximum
support the evidence may be able to provide to the set of events (Giarratano89:284). Another
concept introduced by Dempster-Shafer theory is ignorance. This concept allows for asserting a
fact with a known confidence, but does not imply that the uncertainty of the fact is one minus the
known confidence since there may be some confidence in the falsehood of an event occurring. In
Dempster-Shafer  theory,  the  probability  density  function  can  describe  three  aspects  of  a  set  of 
hypotheses.  The  first  is  the  belief  the  set  of  hypotheses  is  true,  the  second  that  the  set  is  not  true, 
and  the  third  probability  distribution  represents  ignorance  (Giarratano89:280). 

A  simple  example  of  applying  Dempster-Shafer  would  be  to  poll  responses.  If  100  people 
were  polled  on  whether  a  law  should  be  passed,  some  responses  would  be  yes,  some  no,  and  some 
unknown  or  undecided.  For  discussion's  sake,  assume  there  were  40  yes  responses,  35  no 
responses,  and  25  undecided.  Dempster-Shafer  could  make  the  following  assertions.  There  is 
belief that 40% of the respondents support passing of the law, but it is plausible that 65% of the
respondents will support passing. There is also belief that 35% of the respondents support not
passing the law, but it is plausible that 60% of the respondents will not support the passing of the
law. There is also ignorance about 25% of the respondents' positions. At some later point, the
respondents with the unknown answers could commit to yes or no and then their beliefs could be
attributed  to  the  law  passing  or  not  passing  respectively.  Until  such  time  though,  there  remains 
some  ignorance  about  their  responses. 
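
The poll figures can be reproduced with a short calculation. The Python fragment below is a sketch only. The names FRAME, mass, belief, and plausibility are assumptions made for the example. The undecided 25% is modeled as mass assigned to the whole set {yes, no}, which is how ignorance is commonly represented in Dempster-Shafer theory.

```python
# Frame of discernment: the possible answers to the poll.
FRAME = frozenset({"yes", "no"})

# The function m from the text, built from the poll of 100 respondents.
# Mass placed on the whole frame represents ignorance (the undecided).
mass = {
    frozenset({"yes"}): 0.40,
    frozenset({"no"}): 0.35,
    FRAME: 0.25,
}


def belief(hypothesis, m):
    """Minimum support: total mass of the sets contained in the hypothesis."""
    return sum(value for subset, value in m.items() if subset <= hypothesis)


def plausibility(hypothesis, m):
    """Maximum support: total mass of the sets that overlap the hypothesis."""
    return sum(value for subset, value in m.items() if subset & hypothesis)


yes = frozenset({"yes"})
no = frozenset({"no"})
results = {
    "belief_yes": belief(yes, mass),
    "plausibility_yes": plausibility(yes, mass),
    "belief_no": belief(no, mass),
    "plausibility_no": plausibility(no, mass),
}
```

Under these assumptions the belief and plausibility for yes come to 0.40 and 0.65, and those for no come to 0.35 and 0.60, up to floating-point rounding. The gap between belief and plausibility in each case is the 0.25 of ignorance.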

A  major  problem  with  Dempster-Shafer  is  the  generation  of  the  probability  density 
functions. Dempster-Shafer also requires a significant amount of a priori information (i.e.,
generation  of  the  probability  density  functions),  which  if  not  supported  with  known  probabilities, 
would  have  to  be  estimated. 

2.3.5  Fuzzy Set Theory. The last approach discussed is fuzzy set theory (Zadeh65,
Zadeh92). Fuzzy set theory, also known as fuzzy logic, is a generalization of normal set theory
(Schmucker84:5). Normal set theory defines the membership of an element in a set as a Boolean
predicate (i.e., yes or no). Fuzzy set theory represents the membership of a value in a given set as
a possibility distribution (i.e., low to high) (Rich91:246). This variation allows definition of a set
to  represent  an  abstract  concept  such  as  tall. 

In  a  normal  set,  a  discrete  element  is  defined  for  each  height  we  wish  to  represent.  This 
discreteness  presents  a  problem  if  the  question,  "Is  John  tall?"  is  posed  to  the  system.  In  normal  set 
theory,  it  is  very  difficult,  if  not  impossible,  to  adequately  represent  the  concept  tall.  In  fuzzy  set 
theory,  John  has  a  membership  value  associated  with  the  set  TALL.  John  will  also  have  a 
membership  value  associated  with  the  sets  MEDIUM  and  SHORT  where  TALL,  MEDIUM,  and 
SHORT  all  describe  the  height  characteristics  of  people.  TALL,  MEDIUM,  and  SHORT  are  not 
necessarily disjoint sets and may overlap as shown in Figure 1. Depending on the definition of the
set characteristics, a person may have equal membership in multiple sets. For instance, using the
sets defined in Figure 1, a person who is 5.75 feet tall would have the same membership value in
the  sets  MEDIUM  and  TALL. 
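
As a rough illustration of overlapping sets, the sketch below defines SHORT, MEDIUM, and TALL with piecewise-linear membership functions. The breakpoints (5.0, 5.5, and 6.0 feet) are assumptions chosen so that MEDIUM and TALL cross at 5.75 feet; they are not taken from Figure 1.

```python
def ramp_up(x, lo, hi):
    """Rise linearly from 0 at lo to 1 at hi, clamped outside that range."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


# Assumed breakpoints in feet; the real shapes are those of Figure 1.
def short(height):
    return 1.0 - ramp_up(height, 5.0, 5.5)


def medium(height):
    # Triangle: rises from 5.0 to 5.5, then falls from 5.5 to 6.0.
    return min(ramp_up(height, 5.0, 5.5), 1.0 - ramp_up(height, 5.5, 6.0))


def tall(height):
    return ramp_up(height, 5.5, 6.0)


if __name__ == "__main__":
    h = 5.75
    print(short(h), medium(h), tall(h))
```

With these assumed shapes, a height of 5.75 feet has membership 0.5 in both MEDIUM and TALL and 0.0 in SHORT.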

Continuing  with  the  example  of  John's  height,  there  is  another  difference  between  Boolean 
and fuzzy set theories. If John's height is five feet eleven inches, and in our Boolean system tall has
been defined to be those persons six feet and over, again pose the question "Is John tall?" to the
system. The Boolean system would reply no, although most human observers would tend to
categorize John as tall. In the fuzzy logic system, John's membership in the set TALL is defined as
something less than 1.0, but much greater than 0.0; probably around 0.99. A fuzzy logic system
posed the same question should reply that John is a member of the set TALL with a
membership value of 0.99. John does not fully belong to the set TALL, but TALL would be a
fairly accurate linguistic term to describe John's height.

What these two examples demonstrate are the basic concepts exemplified by fuzzy set
theory: the concepts of partial membership within a single set and membership in multiple sets
describing the same attribute. What these concepts provide is a fairly easy and intuitive method to
describe uncertain information.

Fuzzy  set  theory  also  allows  modifiers  such  as  very,  somewhat,  and  slightly 
(Negoita85:75).  These  modifiers  can  be  defined  to  have  the  properties  of  concentrating,  dilating, 
or  shifting  the  primary  fuzzy  set  definition.  These  modifiers  allow  for  a  more  detailed 
discrimination of the members of a primary fuzzy set.
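
A minimal sketch of two such modifiers follows. It assumes the commonly used definitions in which "very" concentrates a set by squaring the membership value and "somewhat" dilates it by taking the square root; the definitions in (Negoita85) may differ, and shifting modifiers are not shown.

```python
import math


def very(membership):
    """Concentration: squaring lowers every partial membership value."""
    return membership ** 2


def somewhat(membership):
    """Dilation: the square root raises every partial membership value."""
    return math.sqrt(membership)


if __name__ == "__main__":
    tall = 0.5  # hypothetical membership of one person in TALL
    print("very tall:", very(tall))
    print("somewhat tall:", somewhat(tall))
```

Full members (1.0) and non-members (0.0) are unchanged by either modifier; only partial memberships move.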

As  pointed  out  earlier,  fuzzy  set  theory  is  a  generalization  of  normal  set  theory.  As  such, 
normal sets can be modeled using fuzzy set theory. This allows for the use of precise set
definitions for those data items that are precisely defined and fuzzy set definitions for those data
items that are ill-defined. An example would be the sets MALE and FEMALE. Most people
would agree that these sets are precisely defined, genetic abnormalities aside, and as such
constitute Boolean sets. In fuzzy set theory it is perfectly acceptable to pose the question, "Is John
male  and  tall?". 
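
The mixed question can be sketched as follows, assuming the standard minimum operator for fuzzy intersection. This is only one possible choice of AND operator, and the membership values are the illustrative ones used for John earlier in this section.

```python
def fuzzy_and(*memberships):
    """Standard fuzzy intersection: the minimum of the membership values."""
    return min(memberships)


# A Boolean (crisp) set is a fuzzy set whose memberships are only 0.0 or 1.0.
john_male = 1.0   # crisp membership in MALE
john_tall = 0.99  # partial membership in TALL

if __name__ == "__main__":
    print("male and tall:", fuzzy_and(john_male, john_tall))
```

Because the crisp membership is 1.0, the answer is governed entirely by the fuzzy term; had it been 0.0, the result would be 0.0 regardless of height.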

Fuzzy set theory is not without its faults. Depending on how the set combination functions
such  as  union  and  intersection  are  implemented,  the  internal  representations  of  the  results  can 
suffer  the  same  combinatorial  explosion  problems  as  the  aforementioned  methods.  There  are 
implementations,  such  as  the  table  lookup  method  presented  later  in  this  thesis,  which  eliminate  the 
combinatorial  problems. 

Fuzzy  set  theory,  as  applied  to  vulnerability  analysis,  still  requires  the  a  priori  definition 
of the probability, or likelihood in fuzzy set terms, that the vulnerabilities will occur. The primary
difference with assigning these likelihoods in fuzzy set theory is the use of linguistic terms. The
other methods require that the probability values be discrete, numeric values, while in fuzzy set
theory, the likelihood indicates a range of values. This range, represented by a linguistic term,
encompasses the inherent imprecision often found in security analysis (Schmucker84:20).

While  all  of  these  methods  have  similar  capabilities  and  drawbacks,  I  found  that  fuzzy  set 
theory  seems  to  provide  the  most  acceptable  method  for  modeling  and  analyzing  vulnerability  data. 
Fuzzy  set  theory,  at  least  in  concept,  can  use  natural  language  to  quantify  and  reason  about 
concepts  describing  ambiguous  characteristics  of  an  object  or  event  (Giarratano89:291).  The 
choice  of  fuzzy  set  theory  to  analyze  vulnerabilities  is  not  without  support  from  other  researchers. 
According  to  one  researcher  in  the  use  of  natural  language  for  risk  estimation,  the  increase  in 
accuracy of the overall estimates by using natural language ranged from 16% to 32% (Nagy81).
The  use  of  natural  language  values  helped  to  eliminate  the  extremely  inaccurate  estimates 
(Schmucker84:36). 

2.4  Analysis  Methods 

2.4.1  Introduction.  There  are  two  general  methods  to  analyze  vulnerabilities: 
quantitative  and  qualitative.  The  quantitative  method  involves  assigning  numeric  values  to  the 
attributes  of  the  vulnerabilities  and  then  using  statistical  and  probabilistic  techniques  to  evaluate 
the vulnerability of a system. The qualitative method involves assigning value judgments, also
known as linguistic values, to the attributes of the vulnerabilities and then using a technique such as
fuzzy set theory to evaluate the vulnerability of a system.

As pointed out by Wood et al., security experts differ in opinion as to which method
(quantitative or qualitative) is the best for the evaluation of computer security (Wood87:7), and
Schmucker indicates that there is no "established or standard" way to perform the evaluation of
computer security (Schmucker:43). The reasons for this are threefold. First, many experts are
opposed to change. If an expert has been using a particular method throughout his career, he is
not likely to want to learn, or possibly even recognize, another method. Second, there tends to be a
significant  cost  difference  between  the  two  methods  (Wood87:7).  This  cost  difference  can  be 
attributed  to  the  amount  of  information  required  for  each  method.  Third,  because  of  the  extremely 
subjective  nature  of  analyzing  computer  security  vulnerabilities,  no  two  experts  can  agree  on  the 
meaning  attributed  to  results  produced  by  any  particular  method. 

The  discussion  below  outlines  the  efforts  of  other  authors  to  develop  systems  capable  of 
dealing  with  this  uncertainty  in  performing  vulnerability  and  risk  analysis.  Most  of  these  systems 
are  risk  analysis  systems  where  vulnerability  analysis,  if  present,  is  a  sub-component  of  the  overall 
system.  There  is  no  definitive  method  to  perform  the  vulnerability  analysis  portion  of  a  risk 
analysis. 

Wood et al. advocate the use of a weighted average of all vulnerabilities (Wood87:12).
Schmucker also uses a weighted averaging scheme, but vulnerabilities are represented by a
category and component hierarchy with the weighted average propagated up the hierarchy
(Schmucker84:47). Hoffman and Neitzel follow an approach identical to Schmucker's
(Hoffman80:366). The three systems above all use an estimate of the probability that the
vulnerability  will  occur. 

Wong  advocates  a  system  that  uses  past  historical  data  to  determine  the  frequency  that  a 
vulnerability  occurs  and  modifies  this  frequency  with  a  weighting  factor  to  try  to  predict  when  the 
vulnerability  might  occur  in  the  future  (Wong77:98).  For  vulnerabilities  with  no  historical  data, 
Wong  recommends  using  a  statistical  survey  (Wong77:101).  The  last  system  identified  is  one 
proposed  by  Carroll.  Carroll’s  system  is  business  oriented  and  calculates  the  annual  loss 
expectancy and return on investing in security measures (Carroll84:5).

2.4.2 Quantitative Analysis. Three of the identified risk analysis methods are based on
quantitative analysis: Wong, Carroll, and Wood. Wong's method is based on the "loss unit
concept" and is shown in Figure 2. Wong uses the term risk interchangeably with vulnerability.
Once values have been derived for all of the frequencies of occurrence (F) and the range of
consequential loss (L, L'), the sum of the products (Σ Fj * Lj) is calculated to produce the expected
loss while the sum of the products (Σ Fj * Lj') is calculated to produce the maximum expected loss.
To predict future losses, a weighting factor is applied to each Fj, Lj, and Lj'. These weighting
factors  relate  past  frequencies  of  occurrence  and  consequential  losses  to  predicted  future  values 
(Wong77:98).  This  method  makes  no  attempt  to  analyze  the  vulnerabilities,  what  Wong  calls 
risks,  for  importance  or  impact.  The  only  measure  used  is  the  frequency  of  occurrence. 
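
The two sums described above can be sketched as follows. The class name, field names, and sample figures are illustrative assumptions, not values from (Wong77), and the weighting factors for predicting future losses are omitted.

```python
from dataclasses import dataclass


@dataclass
class LossUnit:
    frequency: float  # F_j: occurrences per period
    loss: float       # L_j: consequential loss per occurrence
    max_loss: float   # L_j': upper end of the consequential loss range


def expected_loss(units):
    """Sum of F_j * L_j over all loss units."""
    return sum(u.frequency * u.loss for u in units)


def maximum_expected_loss(units):
    """Sum of F_j * L_j' over all loss units."""
    return sum(u.frequency * u.max_loss for u in units)


# Hypothetical figures for illustration only.
units = [
    LossUnit(frequency=2.0, loss=1000.0, max_loss=3000.0),
    LossUnit(frequency=0.5, loss=4000.0, max_loss=10000.0),
]

if __name__ == "__main__":
    print(expected_loss(units), maximum_expected_loss(units))
```

Both totals depend only on frequency and loss size, which reflects the observation that the method does not rank the risks by importance or impact.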


Figure  2.  Loss  Unit  Concept  (Wong77:94). 

Carroll's risk analysis method is based on calculating the return on investment gained by
putting security measures in place. This return on investment is calculated by computing the
difference between the annual loss expectancy without security measures and the annual loss
expectancy with security measures. The annual cost of security measures is subtracted from this
difference to provide the return on investment (Carroll84:5). One of the values used by Carroll to
calculate the annual loss expectancy is a vulnerability rating. This rating is calculated by assigning
a subjective value from 1 to 5 representing the severity of a vulnerability. This value is then converted using a
logarithmic  scale  to  a  value  between  0  and  1.  Carroll  contends,  but  provides  no  supporting 
references,  that  "human  judgment  tends  to  be  quite  accurate  at  the  lower  end  of  a  subjective  scale 
but  not  so  good  at  the  upper  end."  (Carroll84:91). 
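
The return on investment calculation described above can be written compactly. The symbols are introduced here for illustration and are not Carroll's notation: ALE is the annual loss expectancy and C is the annual cost of the security measures.

$$\text{ROI} = \left(\text{ALE}_{\text{without}} - \text{ALE}_{\text{with}}\right) - C$$

As a purely hypothetical example, if the annual loss expectancy falls from 100,000 to 40,000 once the measures are in place, and the measures cost 25,000 per year, the return on investment is (100,000 - 40,000) - 25,000 = 35,000.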

The last quantitative risk analysis method presented is that of Wood et al. Wood et al. have
identified the most prevalent vulnerabilities present in most computer systems and provide these
vulnerabilities in a check-list format. In this method, each vulnerability present in a system is
scored with a numeric value of 0.9, 0.7, 0.5, 0.3, or 0.1. These values represent the linguistic
equivalent of Very High, High, Medium, Low, and Very Low respectively. Once the check-list has
been  completed,  the  ratio  of  vulnerabilities  present  to  the  vulnerabilities  listed  in  the  check-list  is 
calculated  for  each  possible  score.  A  weighted  fraction  is  then  calculated,  as  shown  in  Equation 
(2),  by  summing  the  products  of  the  ratios  with  their  respective  scores  and  dividing  by  the  sum  of 
the  scores. 


$$FW = \frac{\dfrac{NR(0.9)}{N(0.9)} \times 0.9 + \dfrac{NR(0.7)}{N(0.7)} \times 0.7 + \cdots + \dfrac{NR(0.1)}{N(0.1)} \times 0.1}{0.9 + 0.7 + 0.5 + 0.3 + 0.1} \qquad (2)$$


where:  NR(x)  =  number  of  relevant  vulnerabilities  for  score  x 

N(x)  =  number  of  vulnerabilities  in  check-list  for  score  x  (Wood87:12) 


The reciprocal of this weighted fraction is calculated and is referred to by Wood as the
applicability index. A non-adjusted score is calculated by taking the sum of the products of the
number of relevant vulnerabilities and their respective scores (i.e., NR(0.9)*0.9 + NR(0.7)*0.7 +
... + NR(0.1)*0.1). A maximum possible score is calculated by taking the sum of the products of
the number of vulnerabilities in the check-list for each possible score and their respective scores (i.e.,
N(0.9)*0.9 + N(0.7)*0.7 + ... + N(0.1)*0.1). The adjusted score is computed as the product of the
non-adjusted score and the applicability index. The final calculation produces what Wood calls a
control comprehensiveness indicator and is the ratio of the adjusted score and the maximum
possible score. This value is to be used as an indicator of how well the system security measures
perform  with  respect  to  the  vulnerabilities  identified  in  the  checklist  (Wood87:13-16). 
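
The chain of calculations described above can be followed step by step in the sketch below. It implements the steps exactly as summarized in this section, not as checked against (Wood87); the function name, dictionary layout, and sample counts are assumptions for illustration.

```python
SCORES = (0.9, 0.7, 0.5, 0.3, 0.1)  # Very High ... Very Low


def wood_indicator(relevant, listed):
    """relevant[s] is NR(s) and listed[s] is N(s) for each score s."""
    weighted_fraction = (
        sum(relevant[s] / listed[s] * s for s in SCORES) / sum(SCORES)
    )
    if weighted_fraction == 0:
        raise ValueError("no relevant vulnerabilities; index is undefined")
    applicability_index = 1.0 / weighted_fraction
    non_adjusted = sum(relevant[s] * s for s in SCORES)
    maximum = sum(listed[s] * s for s in SCORES)
    adjusted = non_adjusted * applicability_index
    return {
        "weighted_fraction": weighted_fraction,
        "applicability_index": applicability_index,
        "non_adjusted": non_adjusted,
        "maximum": maximum,
        "adjusted": adjusted,
        "indicator": adjusted / maximum,
    }


# Hypothetical check-list counts per score, for illustration only.
listed = {0.9: 10, 0.7: 20, 0.5: 30, 0.3: 20, 0.1: 10}
relevant = {0.9: 5, 0.7: 10, 0.5: 15, 0.3: 10, 0.1: 5}

if __name__ == "__main__":
    for name, value in wood_indicator(relevant, listed).items():
        print(f"{name}: {value:.3f}")
```

In this sample, half of the listed vulnerabilities are relevant at every score, so the weighted fraction is 0.5 and the applicability index is 2.0.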

As an overview of the quantitative methods presented, a few of the benefits and drawbacks
of  these  methods  should  be  mentioned.  First,  significant  effort  must  be  put  into  acquiring  the 
numerical  probability  that  each  vulnerability  will  occur.  Also  if  any  of  the  vulnerabilities  are 
dependent  on  each  other,  the  dependent  probabilities  must  be  determined.  Obviously,  this  requires 
an  extensive  statistical  database  that  covers  all  of  the  vulnerability  combinations  (Wong77:83). 
The  quantitative  methods  also  suffer  when  this  statistical  information  is  not  available  for  a  given 
vulnerability.  In  fact,  when  no  statistical  information  is  available,  the  assignment  of  the 
probabilities of occurrence becomes subjective (Wong77:101). In other words, the expert, in the
absence of statistical data, makes a qualitative value assignment and translates that value into a
numeric value. If a comprehensive statistical database is available, the quantitative methods tend
to  remove  the  subjective  biases  induced  through  estimation  (Wong77:84). 

2.4.3  Qualitative  Analysis.  In  qualitative  methods,  value  judgments  are  associated  with 
each  vulnerability.  These  value  judgments  are  often  based  on  the  instinct,  intuition,  and  experience 
of  the  expert  performing  the  evaluation,  but  may  include  some  statistical  bias.  For  instance,  if  an 
expert knows from experience that users often use common words for passwords, then the
vulnerability caused by not changing passwords frequently may have a higher importance than
users writing down their passwords. If, on the other hand, the expert knows that the system
generates random passwords, the vulnerability of users writing down their passwords would be
higher  than  that  of  infrequent  password  changes. 

The two remaining risk analysis systems identified are based on qualitative analysis
methods. These systems are identical in their approach in that they use fuzzy logic and linguistic
variables to perform the analysis. Both the Schmucker system and the Hoffman and Neitzel
system use a fuzzy weighted average. Each vulnerability is assigned three linguistic values to
represent possibility of loss, severity of loss, and reliability of estimate. The vulnerabilities are
represented in a hierarchical network to indicate the interdependence of vulnerabilities and also to
indicate the various sub-components of the system being analyzed. For each parent node in the
hierarchy, the weighted average of the child nodes is calculated and assigned as the risk of the
parent node. This process is repeated with the weighted averages propagated up the hierarchy until
a single risk value is generated for the top node (Schmucker84:45-47, Hoffman80:370).
Schmucker goes into great detail as to how to perform this weighted average (Schmucker84:49-55)
while  Hoffman  and  Neitzel  simply  provide  a  conceptual  overview  (Hoffman80:370). 
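
The propagation of weighted averages up the hierarchy can be sketched as below. This is a simplification: it uses ordinary numbers in place of fuzzy sets and a single risk value per node rather than the three linguistic values, so it shows only the structure of the propagation, not the fuzzy weighted average of Schmucker or Hoffman and Neitzel. All names and figures are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    weight: float = 1.0
    risk: float = 0.0
    children: list = field(default_factory=list)


def propagate(node):
    """Set each parent's risk to the weighted average of its children."""
    if not node.children:
        return node.risk  # leaf: risk was assigned by the analyst
    total_weight = sum(child.weight for child in node.children)
    node.risk = (
        sum(child.weight * propagate(child) for child in node.children)
        / total_weight
    )
    return node.risk


# Hypothetical two-level hierarchy.
system = Node("system", children=[
    Node("passwords", weight=3.0, risk=0.8),
    Node("physical security", weight=1.0, risk=0.4),
])

if __name__ == "__main__":
    print(round(propagate(system), 3))
```

The recursion visits the leaves first, so every parent is computed only after all of its children have their final values.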

2.5 Organization of Vulnerability Information

Regardless  of  whether  quantitative  or  qualitative  methods  are  used,  there  are  several 
formats  for  organizing  the  vulnerability  information.  Wong  advocates  the  use  of  a  top-down 
hierarchy  to  represent  the  entire  structure  of  vulnerabilities  (Wong77:7).  This  top-down  structure 
is built using information from a variety of sources that include procedures, company and account
information, contract information, site inspections, interviews, and functional flowcharts
(Wong77:43). Combining all of these sources, the analyst is able to generate a hierarchy of
vulnerabilities. Schmucker also advocates the use of a top-down hierarchy to represent
vulnerabilities. This hierarchy is based on the decomposition of the computer system into
components and dependent vulnerabilities (Schmucker84:45).

Another method used to represent vulnerabilities is a check-list. ARES is an example of
such a check-list system. In this type of system, all of the possible vulnerabilities are known and
the analyst merely indicates the presence or absence of each vulnerability. ARES suffers in that
the importance of each vulnerability is not known and that no real vulnerability analysis is
performed. Wood et al. also provide a check-list approach. Wood et al. generated their list of
vulnerabilities based on work performed for the Lawrence Livermore National Laboratory and the
former USAF Logistics Command. This checklist contains 857 identified vulnerabilities
(Wood87:14) where a predetermined importance value is associated with each vulnerability
(Wood87:10). Carroll also provides a check-list, though not as comprehensive as that of Wood et al.

2.6  Current  Air  Force  Analysis  Tools 

As  mentioned  earlier,  ARES  was  an  attempt  to  close  the  gap  in  computer  security  between 
the number of risk analyses the security managers could perform and the number of systems
requiring  analysis.  However,  as  with  any  computer  product,  users  identified  several  shortcomings. 
Foremost,  ARES  provided  no  method  to  evaluate  identified  vulnerabilities.  ARES  simply  provided 
a  list  of  vulnerabilities  to  the  CSO  partitioned  into  several  categories:  Audit  Trails,  Backup, 
Contingency  Plan,  Documentation,  Information  Access  Control,  Magnetic  Remanence  (i.e., 
traces), Media Storage and Control, Operating System, Passwords, Physical Security, Small
Computers,  Software  Configuration  Management,  and  Security  Test  and  Evaluation. 

Within  a  category,  ARES  did  not  differentiate  between  vulnerabilities  based  on  their 
potential impact to the system security. For example, ARES did not differentiate between a door
with no lock in an unclassified environment and a system processing Top Secret data with no
passwords. Both were listed as vulnerabilities under Access Control. It was up to the CSO to
determine which vulnerability was more important. Because of this failing, a CSO may overlook
serious  flaws  in  a  system  if  many  minor  flaws  are  mixed  in. 

To aid in fixing this shortcoming, AFIT proposed a long-term multi-phase project. The
first phase of the project identifies, evaluates, and develops methods to evaluate computer security
vulnerabilities.  Other  phases  will  deal  with  implementing  and  integrating  the  methods  into  ARES 
or  a  similar  program  and  expanding  the  use  of  automated  methods  into  performing  the  valuation 
and  risk  assessment  of  the  overall  system. 

2.7 Automating Computer Security Analysis

There  are  three  basic  problems  hampering  the  automation  of  computer  security  analysis: 
secrecy, attitudes about security, and changing technology. Computer security analysis is often
shrouded  in  secrecy.  The  reasons  for  this  secrecy  are  threefold.  First,  no  computer  facility  is 
absolutely  secure.  Consequently,  no  security  analysts  will  reveal  what  vulnerabilities  exist  at  their 
site.  Second,  when  a  security  analyst  develops  a  method  to  determine  the  level  of  security  at  his 
site,  he  often  will  not  publish  the  results.  This  is  to  prevent  the  security  analyst's  opponents  from 
using  the  method  to  determine  his  security  vulnerabilities.  Last,  security  analysis,  at  least  in  the 
past,  can  be  considered  a  "black  art".  A  security  analyst  often  makes  decisions  about  the  relative 
security  of  a  site  based  on  instinct,  intuition,  and  experience  rather  than  rules  and  formulas.  Given 
this  subjective  analysis,  it  is  very  difficult  to  automate  security  analysis. 

Another  reason  computer  security  analysis  has  not  been  successfully  automated  is  the 
attitude about security in general. In the research community, computer security is often viewed as
a hindrance. Users often want to share their findings with many colleagues and thus will try to
circumvent  safeguards  to  allow  freer  access  to  their  data  and  system.  Users  also  tend  to  have  very 
myopic  views  about  the  scope  of  security.  They  may  intend  to  only  allow  access  to  their  files  by 
selected  colleagues  when  in  fact  they  open  up  the  entire  system  to  everyone.  At  most  sites,  the  only 
personnel  who  are  truly  security  conscious  are  the  system  administrators  and  CSO. 

The  last  hurdle  in  trying  to  automate  computer  security  is  changing  technology.  Twenty 
years ago, most computers were large mainframes with many terminals connected. Outside access
to the systems was minimal, if it existed at all. Today, computers are connected worldwide via high speed
networks. With the speed of these networks reaching 100 megabytes per second transfer rate, it
only  takes  a  few  seconds  for  large  amounts  of  data  to  be  compromised. 

The other change in technology is the proliferation of personal computers. As of 1988,
over 45 million personal computers were in use. In that same year, personal computers made up
92% of the total computers shipped by US manufacturers. For the five-year span of 1987 through
1991, US manufacturers shipped almost 32 million computer systems, with personal computers
accounting for 29.5 million of the systems. The government alone had a tenfold increase in the
use of personal computers. Also of interest was the increase in the number of workstations sold
during this five-year period: while mainframe sales decreased by 62%, workstation sales increased by
360% (Census93:Tables 648, 1273, and 1274).

What  this  implies  is  that  methodologies  developed  to  ensure  the  security  of  large 
mainframes  with  dedicated  terminals  may  not  necessarily  work  for  large  distributed  networks  with 
many  individual  personal  computers  and  workstations. 

Given  these  problems,  the  need  for  automating  computer  security  has  never  been  greater. 
Computer security analysts need effective and reliable tools to help them evaluate the risks and
vulnerabilities associated with computers. They need tools that are adaptable to new technologies
and  scalable  to  handle  the  increasing  number  and  types  of  systems  requiring  evaluation. 

2.8 Summary

There  is  a  definite  need  to  perform  timely  and  accurate  evaluation  of  computer  security 
vulnerabilities. The ability to handle uncertain information is paramount in performing this
evaluation. The evaluation method should not require an extensive statistical database in order to
establish a baseline for processing. Because of increasing numbers of computer systems and thus
the required number of evaluations to be performed, the evaluation process needs to be automated.

Much emphasis is placed on performing risk analysis of computer systems, but little
emphasis is placed on vulnerability analysis. As one author stated, "The process of risk analysis
centers on vulnerabilities" (Carroll84:87).

Of  the  tools  available  to  assist  in  automating  the  evaluation  of  computer  security 
vulnerabilities, those AI tools with the capacity to reason with uncertainty appear to hold promise.
There are many AI methods that provide the capability to evaluate uncertain information, a few of
which have been presented in this chapter. Of these, I find fuzzy set theory the most interesting
approach and it appears to have promise for dealing with the inherent imprecision found in security
analysis.  Fuzzy  set  theory  has  the  capability  to  describe  and  maintain  the  relationship  between  two 
facts,  whether  well-  or  ill-defined,  and  through  the  use  of  linguistic  variables,  provides  an  intuitive 
(to  the  author)  language-based  interface  between  the  system  and  the  security  analyst. 




III. Methodology


3.1  Introduction 

The  methods  used  to  evaluate  computer  security  vulnerabilities  can  be  divided  into  two 
main  types:  quantitative  and  qualitative.  Quantitative  methods  involve  assigning  a  numerical  value 
to  each  criterion  under  consideration  and  usually  involve  probabilistic  or  statistical  analysis. 
Qualitative  methods  assign  a  linguistic  value  to  each  criterion.  These  linguistic  values  include 
terms  such  as  high,  low,  likely,  possible,  and  never.  A  qualitative  analysis  method  that  uses 
linguistic terms is fuzzy logic. With either approach, the assignment of a value, whether a numeric
quantity  or  linguistic  quality,  is  subjective  in  the  absence  of  an  extensive  statistical  database. 

3.2  Overview 

This  chapter  outlines  the  methodology  used  to  develop  an  AI  method  to  assist  in  the 
evaluation  of  computer  security  vulnerabilities.  It  is  my  contention  that  a  qualitative  approach 
will  provide  comparable  results:  it  is  computationally  feasible,  theoretically  sound,  scalable,  easier 
to  use,  and  more  intuitive  to  the  user.  The  main  result  of  this  research  is  the  demonstration  of  the 
feasibility  of  using  a  qualitative  analysis  method  to  evaluate  computer  security  vulnerabilities. 

In  order  to  demonstrate  the  feasibility  of  using  a  qualitative  analysis  method  to  evaluate 
vulnerabilities, a comparable quantitative analysis method had to be identified or developed. As
discussed in chapter 2, the available research focused on risk analysis methods, not vulnerability
analysis. Where vulnerability analysis was mentioned, it involved the subjective assignment of
values to the vulnerabilities. Not finding any existing vulnerability analysis system, I developed
my own. Without any industry standard to benchmark my development against, the methods I
developed  and  present  here  are  based  on  the  intuition  and  experience  I  developed  as  a  computer 
systems  analyst  for  and  manager  of  a  very  large,  secure  data  processing  facility. 

To ensure the vulnerability analysis system I developed wasn't biased towards my
hypothesis, I developed the quantitative analysis method first. This system is based on standard
statistical methods of analysis for independent vulnerabilities. Independence of vulnerabilities is
assumed to minimize the effects of combinatorial explosion. Although in the real world,
vulnerabilities  are  not  necessarily  independent,  I  felt  it  was  best  to  keep  the  systems  simple.  This 
simplicity  makes  the  results  easier  to  compare. 

I  then  implemented  a  qualitative  method  using  the  same  assumption  of  independence,  but 
using fuzzy set theory as the primary analysis method. I chose fuzzy set theory because I felt it
best represented a fully qualitative approach. I based my initial fuzzy set theory implementation on
the  work  of  Schmucker,  but  discovered  that  this  method  suffers  from  combinatorial  explosion.  To 
overcome  this,  I  developed  an  alternative  table  lookup  method.  The  reasons  and  justification  for 
using  this  table  lookup  method  are  provided  later  in  this  chapter. 

One concern was the applicability of this research to existing risk analysis systems. As
most of the existing systems identified simply require a subjective vulnerability value, I contend
that  a  systematic  analysis  of  identified  vulnerabilities  should  increase  the  accuracy  of  these  risk 
analysis  systems.  This  contention  is  made  based  on  intuition  as  opposed  to  empirical  evidence. 
Due  to  time  and  budgetary  constraints,  I  was  not  able  to  obtain  a  working  implementation  of  any  of 
the  mentioned  risk  analysis  systems  in  order  to  prove  my  contentions.  However,  I  feel  that  any 
systematic approach that is reasonable is better than a subjective guess.

3.3 Background


There are two main methods I used in this thesis to evaluate vulnerability data. The first is
a quantitative method based on statistical analysis. The second is a qualitative method based on
fuzzy logic as proposed by Zadeh (Zadeh65:338). Below are the specifications of the methods
used. 

Each  method  implemented  used  a  subset  of  the  vulnerabilities  identified  by  trial  runs  of 
ARES.  This  data,  shown  in  Appendix  B,  was  only  used  as  a  demonstration  vehicle  for  each 
method  and  is  the  result  of  many  varied  runs  of  ARES.  The  probability  distributions  and  influence 
values  associated  with  the  identified  vulnerabilities  were  randomly  generated  and,  therefore,  the 
vulnerability  values  shown  do  not  represent  an  actual  system.  However,  the  data  is  representative 
of  a  possible  system. 

3.4 Statistical Analysis.

The statistical analysis method used is straightforward and should be familiar to most
readers. The main text used for this analysis method is "Probability and Statistics for Engineers"
by Scheaffer and McClave (Scheaffer86:1). For each vulnerability, eight values are given. The
first value, 'vuln-influence', indicates the overall influence of that
particular vulnerability across all seven functional areas. This value, between 0 and 1, represents
the subjective rating of importance of this vulnerability with respect to the overall vulnerability of
the system. The rating value given assumes that the vulnerability is independent of all other
vulnerabilities. The other seven values, indicated by the prefix 'dist-', represent the allotment of
this vulnerability's influence to each of the defined functional areas presented in the previous
chapter. Each of these allotment values ranges from 0 to 1 and indicates how much of this
vulnerability's influence value is applied to a specific functional area. These allotment values can
also be thought of as an impact rating indicating the degree to which this vulnerability impacts a
particular functional area.

Two  separate  processes  are  performed  on  the  data.  The  first  is  a  statistical  analysis  for 
each  functional  area.  For  each  functional  area,  the  total  influence  given,  the  weighted  average  of 
influence  given,  the  standard  deviation  of  influence,  and  the  percentage  of  influence  are  calculated. 
The  product  of  the  influence  value  and  the  functional  area  distribution  value  is  used  as  the 
contribution by each vulnerability. The contributions of all vulnerabilities to a specific functional
area  are  summed  to  generate  the  total  influence  given  to  that  functional  area.  This  sum  is  then 
divided  by  the  number  of  vulnerabilities  contributing  to  produce  the  weighted  average  of  influence 
given.  The  standard  deviation  of  influence  is  calculated  using  the  contributions  of  each 
vulnerability  for  a  specific  functional  area.  The  percentage  of  influence  is  calculated  by  taking  the 
total  influence  given  for  a  functional  area  and  dividing  by  the  sum  of  the  total  influence  given  for 
all  areas. 
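The per-area calculations described above can be sketched in Python as follows. This is a minimal illustration rather than the thesis implementation: the record layout (a dict with `influence` and `dist` keys), the function name `area_statistics`, and the use of the population standard deviation are assumptions made for the example. The functional area names follow the slot names in Figure 7.

```python
from statistics import pstdev

# Functional area names taken from the 'dist-' slots in Figure 7.
AREAS = ["audit", "recover", "access", "media", "os",
         "configuration", "documentation"]


def area_statistics(vulns):
    """Compute per-area statistics for a non-empty list of vulnerability records.

    Each record is assumed to be a dict with an 'influence' value in [0, 1]
    and a 'dist' dict mapping each area name to an allotment in [0, 1].
    """
    # Contribution of each vulnerability = influence * allotment for the area.
    contrib = {a: [v["influence"] * v["dist"][a] for v in vulns] for a in AREAS}
    totals = {a: sum(values) for a, values in contrib.items()}
    grand_total = sum(totals.values())
    stats = {}
    for a in AREAS:
        stats[a] = {
            "total": totals[a],
            # Total divided by the number of vulnerabilities.
            "average": totals[a] / len(vulns),
            # Assumption: population standard deviation of the contributions.
            "std_dev": pstdev(contrib[a]),
            # Share of the influence summed over all areas, as a percentage.
            "percent": 100.0 * totals[a] / grand_total if grand_total else 0.0,
        }
    return stats
```

The average divides by the total number of vulnerabilities, which matches the later observation that the sum of the maximum contributions equals the number of vulnerabilities.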

The  second  process  performed  on  the  data  involves  identifying  those  vulnerabilities  that 
contribute  significantly  to  each  functional  area.  This  is  done  by  identifying  and  listing  those 
vulnerabilities  that  are  in  or  matching  the  top  10%  of  the  contributors  (i.e.,  if  50  vulnerabilities  are 
in  the  system,  then  at  least  the  top  5  contributors,  but  maybe  more  depending  on  tying  conditions). 
Tying  conditions  prevent  the  strict  application  of  a  10%  cutoff.  With  no  other  information 
available  other  than  contribution,  it  is  not  reasonable  to  discriminate  against  a  vulnerability  if  its 
contribution  value  ties  with  one  in  the  top  10%. 
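The tie-tolerant 10% cutoff can be sketched as below. The function name `top_contributors` and the choice to round the cutoff count up when the number of vulnerabilities is not a multiple of ten are assumptions; the text only fixes the behavior for ties.

```python
def top_contributors(contributions, percent=10):
    """Return the names in, or tying with, the top `percent` of contributors.

    `contributions` maps a vulnerability name to its contribution to one
    functional area.
    """
    if not contributions:
        return []
    ranked = sorted(contributions.items(), key=lambda kv: kv[1], reverse=True)
    # Integer ceiling of percent% of the population, with at least one entry.
    k = max(1, (len(ranked) * percent + 99) // 100)
    cutoff = ranked[k - 1][1]
    # Every vulnerability whose contribution ties with the cutoff is kept.
    return [name for name, value in ranked if value >= cutoff]
```

With 50 vulnerabilities the cutoff is the fifth-largest contribution, and any vulnerability matching that value is listed as well, so the result can hold more than five names.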
Also identified are those vulnerabilities that are in or matching the top 10% of more than
one functional area. It should be apparent that the vulnerabilities that occur in or match the top
10% of the most functional areas are the most critical. The purpose of this is to identify to the
computer security expert those vulnerabilities that should be addressed first.

Vulnerabilities are also grouped by units of standard deviation as
measured from the maximum contributor: those within 1 standard
deviation, those within 2 standard deviations, and so forth. The use of the standard deviation is an arbitrary
choice, but does provide some sense as to how the vulnerabilities could be grouped.
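One way to express this grouping is sketched below. The function name `group_by_deviation`, the population standard deviation, and the treatment of a value lying exactly on a band boundary (it falls into the next band) are assumptions, since the text does not specify them.

```python
from statistics import pstdev


def group_by_deviation(contributions):
    """Group names by distance from the maximum contributor.

    Band 1 holds contributions less than one standard deviation below the
    maximum, band 2 those less than two, and so on. Expects a non-empty dict.
    """
    values = list(contributions.values())
    top = max(values)
    sd = pstdev(values)
    groups = {}
    for name, value in contributions.items():
        # With zero spread every contribution equals the maximum.
        band = 1 if sd == 0 else int((top - value) // sd) + 1
        groups.setdefault(band, []).append(name)
    return groups
```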

3.5  Fuzzy  Logic. 

For the fuzzy logic analysis, two approaches were implemented. The first was a
calculation method based on the method implemented by Schmucker and the second approximated
these calculations using a table lookup.

In  both  methods,  it  is  impossible  to  take  a  weighted  average  of  linguistic  values  since  the 
definition  of  weighted  average  implies  division  by  the  number  of  terms  in  the  summation.  For 
fuzzy  logic,  it  is  more  appropriate  to  use  a  normalized  average.  In  the  statistical  analysis  method, 
the sum of the contributions to a functional area was divided by the number of vulnerabilities
contributing to that functional area to produce a weighted average of influence. This was possible
since the maximum contribution of any one vulnerability to any functional area was one and the
sum of these maximum contributions equals the number of vulnerabilities. In the fuzzy logic
analysis,  this  weighted  average  is  simulated  by  normalizing  the  sum  of  contributions  by  the  sum  of 
the  influence  values  for  each  vulnerability.  The  sum  of  influence  values  in  the  fuzzy  logic  analysis 
method  is  equivalent  to  the  sum  of  the  maximum  contributions  in  the  statistical  analysis  method. 

As in the statistical analysis method, two separate processes are performed on the data.
The  first  provides  the  total  influence  given  to  the  entire  system,  the  influence  given  to  each 
functional  area,  and  the  normalized  average  influence  given  each  functional  area.  Since  linguistic 
values  are  used,  it  is  not  possible  to  generate  a  standard  deviation  or  percent  of  influence  given. 

The  contribution  of  each  vulnerability  to  a  functional  area  is  again  the  product  of  the 
vulnerability's influence value and distribution value. The sum of these contributions makes up the
total  influence  given  to  each  functional  area.  The  sum  of  the  total  influence  given  to  each 
functional  area  gives  the  total  influence  given  the  system.  The  normalized  average  influence  is 
calculated  as  the  total  influence  given  each  area  divided  by  the  total  influence  given  the  system. 

The  second  process  performed  on  the  data  is  again  the  identification  of  those 
vulnerabilities that provide significant contribution to each functional area. The same top 10%
criterion is applied to the fuzzy results as was applied to the statistical results in order to identify
critical vulnerabilities. Since the standard deviation cannot be calculated, the linguistic values
themselves are used to group the results. What follows is a discussion of the two methods used to
perform the fuzzy arithmetic and a discussion of fuzzy arithmetic in general.

3.5.1 Schmucker's Calculation Method. In the first approach implemented with
fuzzy logic, the calculations were based on Schmucker (Schmucker84:48) and used the equations
shown in Equation (3) for fuzzy arithmetic. In the equations the notation a(i)/i represents the fuzzy
element i in the set with membership in the set of a(i). Schmucker explains his fuzzy arithmetic
equations as follows:

What this definition means computationally is that to compute the degree of
membership of, say, 8 in A + B, we have to examine all of the possible ways that
two integers (taken from the set {1, 2, 3, 4, 5, 6, 7, 8, 9}) can sum to 8 and
examine the degrees of membership of these pairs. Thus, if the degree of
membership of 8 in A + B was x, then x would be computed as follows:

x = max{min(a(1), b(7)), min(a(2), b(6)), min(a(3), b(5)), min(a(4), b(4)),
min(a(5), b(3)), min(a(6), b(2)), min(a(7), b(1))}.


Each of the minimum operations computes one of the degrees of membership of 8
in the set A + B. We then take the greatest such degree of membership to be the
degree of membership of 8 (Schmucker84:48).


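\[
\begin{aligned}
A &= \{\, a(i)/i \mid 1 \le i \le n \,\} \\
B &= \{\, b(j)/j \mid 1 \le j \le n \,\} \\
A + B &= \max\{\min(a(i),\, b(j))\}/[k], && 1 \le i, j \le n,\ k = i + j \\
A * B &= \max\{\min(a(i),\, b(j))\}/[k], && 1 \le i, j \le n,\ k = i * j \\
A / B &= \max\{\min(a(i),\, b(j))\}/[k], && 1 \le i, j \le n,\ k = i / j
\end{aligned}
\tag{3}
\]

where: i, j, and k are fuzzy set indices
       n is the number of elements used to describe the fuzzy set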


Here, the fuzzy set is defined over n elements. Although fuzzy sets can be defined over
continuous functions, it is much easier to implement using a discretized set. The foundation for
these equations is based on Zadeh's extension principle and a good explanation is given in
Appendix B of Schmucker's book (Schmucker84:133). Similar methods for fuzzy arithmetic are
outlined by Kaufmann and Gupta (Kaufmann85:14).

Before  continuing,  an  example  of  how  these  equations  operate  is  beneficial.  For 
simplicity's  sake,  we  will  define  three  sets  with  three  discrete  points. 

ONE  =  {1/1, 0/2,  0/3} 

TWO  =  {0/1,  1/2,  0/3} 

THREE  =  {0/1,  0/2, 1/3} 

If  we  add  ONE  and  TWO  we  get  the  following: 

ONE  +  TWO  =  {0/2, 1/3,  0/4,  0/3,  0/4,  0/5,  0/4,  0/5,  0/6} 

and multiplication results in:

ONE * TWO = {0/1, 1/2, 0/3, 0/2, 0/4, 0/6, 0/3, 0/6, 0/9}

It  should  be  noted  that  missing  elements  are  assumed  to  have  a  membership  of  zero  and  for 
terms  with  multiple  indices,  the  maximum  membership  value  is  used.  Therefore,  the  resulting  sets, 
after correcting the terms, are as follows:

ONE + TWO = {0/1, 0/2, 1/3, 0/4, 0/5, 0/6}

ONE * TWO = {0/1, 1/2, 0/3, 0/4, 0/5, 0/6, 0/7, 0/8, 0/9}

Based on the equation above, the result of a division is the following.

TWO / ONE = {0/1, 0/0.5, 0/0.33, 1/2, 0/1, 0/0.67, 0/3, 0/1.5, 0/1}

Division is used by Schmucker to produce a weighted "average" of the influence in his
vulnerability analysis. To simplify the results of fuzzy set division, Schmucker uses a method
proposed by Clements. This method places in the set resulting from the fuzzy division only those
terms i and j for which i divided by j results in an integer. The example below demonstrates
this simplification.

TWO / ONE = {0/1, 1/2, 0/1, 0/3, 0/1}

As before, should multiple fuzzy set indices occur, the index with the maximum
membership is used in the final normalization of results. This results in the following:

TWO / ONE = {0/1, 1/2, 0/3}
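The worked examples above can be reproduced with a short Python sketch of the max-min operations in Equation (3). The dict representation (index mapped to membership) and the function names are assumptions made for illustration. Indices that no pair of inputs produces are simply absent from the result and are read as having a membership of zero.

```python
def fuzzy_op(a, b, index_op):
    """Combine fuzzy sets a and b (dicts of index -> membership).

    For every pair (i, j) the result index is index_op(i, j) and the candidate
    membership is min(a(i), b(j)); the maximum candidate is kept per index.
    """
    result = {}
    for i, a_i in a.items():
        for j, b_j in b.items():
            k = index_op(i, j)
            if k is None:  # pair excluded by the operation
                continue
            result[k] = max(result.get(k, 0.0), min(a_i, b_j))
    return result


def fuzzy_add(a, b):
    return fuzzy_op(a, b, lambda i, j: i + j)


def fuzzy_mul(a, b):
    return fuzzy_op(a, b, lambda i, j: i * j)


def fuzzy_div(a, b):
    # Simplified division: keep only pairs where i / j is an integer.
    return fuzzy_op(a, b, lambda i, j: i // j if j != 0 and i % j == 0 else None)


ONE = {1: 1.0, 2: 0.0, 3: 0.0}
TWO = {1: 0.0, 2: 1.0, 3: 0.0}

print(fuzzy_add(ONE, TWO))  # membership 1.0 at index 3
print(fuzzy_mul(ONE, TWO))  # membership 1.0 at index 2
print(fuzzy_div(TWO, ONE))  # membership 1.0 at index 2
```

The nested loop makes the cost of one operation proportional to the product of the two set sizes.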

The major drawback of implementing Schmucker's method is how the fuzzy sets expand.
For instance, if 50 fuzzy sets, each defined over 7 terms, are added, the resulting set is defined over
350 terms. The expansion problem is greatly exacerbated for multiplication. If the same 50 sets
are multiplied together, the resulting set contains $7^{50}$, or almost $1.8 \times 10^{42}$, terms.

This  is  clearly  not  feasible  for  vulnerability  analysis  where  the  possibility  exists  for  many 
hundreds  of  vulnerabilities.  Even  using  the  weighted  averaging  function  defined  by  Schmucker, 
which will return a final result fuzzy set with the same number of terms as the primary sets
(Schmucker84:49),  the  intermediate  calculations  within  the  averaging  process  tend  to  make  the 
fuzzy  set  calculations  intractable. 

An  example  demonstrates  this  very  quickly.  Assume  50  identified  vulnerabilities,  each 
vulnerability  distributed  over  7  functional  areas  and  containing  a  single  vulnerability  influence 
value.  To  calculate  the  weighted  average  for  a  single  functional  area  would  require  that  the 
numerator of the weighted average function contain a fuzzy set definition with 2450 elements and
the denominator contain 350. This assumes that the primary fuzzy set is defined over 7 elements. The
numerator consists of adding 50 49-element sets. The 49-element sets are generated by multiplying
2  7-element  primary  terms  (the  distribution  value  for  a  functional  area  and  the  influence  provided 
by  that  vulnerability).  The  denominator  is  the  sum  of  50  7-element  sets. 

For  each  iteration  through  the  equation,  i.e.,  for  each  i  and  j,  two  operations  are  required. 
The  first  is  the  mathematical  operation  on  the  index  and  the  second  is  the  comparison  for  the 
minimum  value  at  a(i)  and  b(j).  Each  multiplication  of  the  2  7-element  sets  requires  98  operations. 
The  first  addition  of  2  49-element  sets  requires  4802  operations  (49  indices  in  first  set,  49  indices 
in  second  set,  2  operations  per  index)  resulting  in  a  98-element  set  after  normalization.  For  the 
sake  of  simplicity,  we  will  assume  normalization  requires  zero  operations.  The  second  addition 
requires  9604  operations;  the  third  requires  19208  operations  and  so  forth.  For  this  example,  the 
number of operations required for each successive addition grows on the order of $2^{n-1}$. Clearly,
this  is  not  computationally  feasible  for  any  large  number  of  vulnerabilities. 

3.5.1.1 Translation of Fuzzy Set to Linguistic Term. After a set has been
normalized, it is 'translated' to a linguistic term. In this case, a "best fit" approach is used to
translate a fuzzy set to a linguistic term. The equation to do this best fit is given in Schmucker
(Schmucker84:56) and is shown in Equation (4).

The actual translation occurs when the Euclidean distance is calculated for each of the
predefined fuzzy terms. The fuzzy term with the smallest distance from the set of interest is
considered the "best fit".


\[
d(X, F) = \sqrt{\sum_{i=1}^{n} \left( \mu_X(i) - \mu_F(i) \right)^2}
\tag{4}
\]

where: X is the fuzzy set to be translated

F is the fuzzy set representing a pre-defined linguistic term

There is an ambiguity with this method. The problem occurs when the Euclidean distance
is the same for two or more fuzzy sets. As implemented, the system will choose the linguistic term
with the 'lowest' relative value. This in part is to prevent the system from suffering combinatorial
explosion.
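The best-fit translation and its tie rule can be sketched as follows. The three-point term definitions in `TERMS` are hypothetical values invented for this example (they are not the thesis's fuzzy sets), and the function names are likewise assumptions.

```python
import math

# Hypothetical term definitions, ordered from lowest to highest term.
TERMS = [
    ("LOW", [1.0, 0.5, 0.0]),
    ("MEDIUM", [0.5, 1.0, 0.5]),
    ("HIGH", [0.0, 0.5, 1.0]),
]


def euclidean_distance(x, f):
    """Euclidean distance between two membership vectors of equal length."""
    return math.sqrt(sum((x_i - f_i) ** 2 for x_i, f_i in zip(x, f)))


def best_fit(x, terms=TERMS):
    """Return the name of the term closest to the membership vector x."""
    best_name, best_distance = None, math.inf
    for name, f in terms:  # scanned from the lowest term upward
        d = euclidean_distance(x, f)
        if d < best_distance:  # strict '<' keeps the lowest term on a tie
            best_name, best_distance = name, d
    return best_name
```

Because the terms are scanned from lowest to highest and only a strictly smaller distance replaces the current choice, a tie resolves to the lowest term, as described above.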

3.5.1.2 Normalization and Convexity of Fuzzy Sets. This implementation
made use of normalized and convex sets. The use of normalized and convex fuzzy sets aids in
'translating' a fuzzy set back to a linguistic term as shown above.

A normal fuzzy set is one in which the element(s) of the set with the maximum membership value
has (have) a membership value of one. A fuzzy set A is normal if and only if

\[ \forall x \in R:\ \max_{x} \mu_A(x) = 1 \]

where $\mu_A(x)$ represents the membership value of element x in fuzzy set A
(Kaufmann85:12).

A  convex  fuzzy  set  is  where  if  the  set  was  plotted,  it  would  have  at  most  one  positive  slope 
and  at  most  one  negative  slope.  Note  that  the  plot  of  the  set  is  not  required  to  have  either  (e.g.,  a 
horizontal  line),  or  may  have  only  a  single  positive  or  a  single  negative  sloping  line,  or  may  have 
both, but it cannot have more than one positive or more than one negative sloping line if the fuzzy
set  is  to  be  convex. 

A fuzzy set A is convex if and only if

\[ \forall x, y \in R:\ \mu_A[\lambda x + (1 - \lambda) y] \ge \mu_A(x) \wedge \mu_A(y), \quad \forall \lambda \in [0, 1] \]

where $\mu_A(x)$ and $\mu_A(y)$ represent the membership values of elements x and y respectively
in fuzzy set A, and $\wedge$ denotes the minimum (Kaufmann85:11). The purpose of making a set convex is to prevent conditions like
High and Low from occurring simultaneously. Of course, this could be represented by Not
Medium, but that is an implementation choice.
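For a discretized set, the two properties can be checked with a few lines of Python. This is a sketch under stated assumptions: a fuzzy set is a list of membership values ordered by element, the function names are illustrative, and `normalize` scales by the peak membership, which is one common way to normalize and not necessarily the one used in the implementation described here.

```python
def is_normal(memberships):
    """A set is normal when its largest membership value equals one."""
    return max(memberships) == 1.0


def normalize(memberships):
    """Scale memberships so that the peak becomes one (assumed method)."""
    peak = max(memberships)
    if peak == 0:
        return list(memberships)  # an all-zero set cannot be scaled
    return [m / peak for m in memberships]


def is_convex(memberships):
    """True when the values never rise again after they have fallen."""
    falling = False
    for previous, current in zip(memberships, memberships[1:]):
        if current < previous:
            falling = True
        elif current > previous and falling:
            return False  # a second rising slope: more than one peak
    return True
```

A set such as `[1, 0, 0, 1]`, which is High and Low at the same time, fails `is_convex`, while a single-peaked or flat set passes.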

I did, however, make use of convex sets in my implementation of Schmucker's fuzzy
arithmetic. This was particularly important for the fuzzy multiplication of two seven-element
fuzzy sets that returned a set of 49 terms. These sets usually had many peaks and valleys.
Depending on where these peaks and valleys fell, the translation back to a linguistic term would
produce unreliable results. For instance, Very High * Very High would return Very Low. This is
because once the set was mapped back to a 7-element set, the value at element 1 would be greater
than any other value. Making the 49-element set convex before mapping back to a 7-element
set eliminated that problem.

There is a concern that normalizing a fuzzy set and making a fuzzy set convex will change
the meaning of the fuzzy set prior to these operations being performed. The reason these
operations are used, even though they may change the meaning of the original fuzzy set, is because
without ensuring the convexity and normalization of the fuzzy set it is extremely difficult to find a
linguistic expression to represent the original fuzzy set. As pointed out above, the use of convexity
and normalization prevents the case of a fuzzy set being described as both High and Low but not
Medium.

3.5.2 Table Lookup Method. The second approach implemented using fuzzy logic
attempted to overcome the combinatorial explosion. To do this, a table lookup method was
implemented to perform the fuzzy mathematical operations. The values for the addition and
multiplication fuzzy function tables are shown in Tables 1 and 2 starting on page 51. These tables
were derived using a method that attempted to model the behavior of the equivalent numeric
functions. The behavior of the function is such that adding a small (relative magnitude between 0
and 1) number to a small number produces a small number. Likewise, adding a large number to a
large number produces a large number.

The translation of the numeric value into its equivalent linguistic term was accomplished
using both a linear mapping and a non-linear mapping. The linear mapping was produced by
dividing  the  range  of  0  to  1  into  even  groups.  In  this  case,  I  had  seven  linguistic  terms,  so  each 
group  equaled  one-seventh.  Any  value  between  0  and  1/7  was  assigned  to  VERYLOW,  any  value 
between  1/7  and  2/7  was  assigned  to  LOW,  and  so  forth. 
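The linear mapping can be sketched in a few lines of Python. The function name `linear_term` is an assumption, as is the handling of values that fall exactly on a boundary (they are assigned to the upper group, except 1.0, which stays in the top group). The term order follows the seven linguistic terms used in this thesis.

```python
TERMS = ["VERYLOW", "LOW", "MEDIUMLOW", "MEDIUM",
         "MEDIUMHIGH", "HIGH", "VERYHIGH"]


def linear_term(value):
    """Map a number in [0, 1] to one of seven equal-width linguistic groups."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("value must lie in [0, 1]")
    # Each group spans one-seventh of the range; clamp 1.0 into the top group.
    index = min(int(value * len(TERMS)), len(TERMS) - 1)
    return TERMS[index]


print(linear_term(0.5))  # MEDIUM
print(linear_term(0.2))  # LOW
```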

The  non-linear  mapping  was  based  on  the  fuzzy  distributions  defined  in  Figure  3.  These 
fuzzy  sets  were  arbitrarily  chosen  with  the  intent  to  build  a  model  with  a  large  middle  and  small 
extremes.  The  goal  of  building  this  model  was  simply  to  test  the  effectiveness  of  a  table  lookup 
scheme using a non-uniform distribution mapping. Here, the assignment occurs to those linguistic
values where the numeric value has a maximum membership. For instance, 0.03 would map to
VERYLOW, but 0.04 would map to LOW.

To  actually  build  the  table,  a  simple  program  was  built  with  two  loops  that  iterated  from  0 
to  1  in  increments  of  0.001.  The  value  of  each  loop  was  translated  to  a  linguistic  term  in  order  to 
determine  the  fuzzy  set  these  values  were  in.  The  indicated  operation  was  performed  on  the 
numeric  values  and  the  result  was  translated  to  a  linguistic  term.  By  keeping  track  of  how  many  of 
each  linguistic  result  occurred  given  the  linguistic  input  values,  and  using  the  linguistic  result  with 
the  maximum  occurrences,  the  function  behavior  was  mapped. 

Figure 3. Defined Fuzzy Sets.


For instance, in building the linear mapping multiplication table, the input values might be
0.5 and 0.2. The 0.5 would map to MEDIUM and the 0.2 to LOW. The product of 0.5 and 0.2,
0.1, maps to VERYLOW. However, if the inputs were 0.56 and 0.28, again mapping to MEDIUM
and LOW respectively, the result, 0.1568, maps to LOW. This is an example of the boundary
condition caused by mapping an infinite sequence of values onto a finite map. As it turns out in
this example, with the loops providing 1000 values from 0 to 1, there are 19,138 VERYLOW
products of MEDIUM and LOW and 1,311 LOW products. Because of the preponderance of
VERYLOW results, that value is used to represent the result of multiplying MEDIUM and LOW.
It should be pointed out that this method maintains the normal commutativity of multiplication and
addition.
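A minimal Python sketch of this tallying procedure for the multiplication table under the linear mapping is shown below. It is not the original program: the function names are assumptions, integer arithmetic is used in place of floating-point loop variables to avoid rounding at group boundaries, the loop includes both endpoints, and a tie between result counts is resolved toward the lower term. Exact tallies may therefore differ slightly from the counts quoted above.

```python
from collections import Counter

TERMS = ["VERYLOW", "LOW", "MEDIUMLOW", "MEDIUM",
         "MEDIUMHIGH", "HIGH", "VERYHIGH"]
STEPS = 1000  # loop values are k / 1000 for k = 0 .. 1000


def term_index(numerator, denominator):
    """Linear group of numerator/denominator (a value in [0, 1]), integers only."""
    return min(numerator * len(TERMS) // denominator, len(TERMS) - 1)


def build_multiplication_table():
    """Tally the linguistic result of every product and keep the most frequent."""
    tallies = {}
    for i in range(STEPS + 1):
        row = term_index(i, STEPS)
        for j in range(STEPS + 1):
            col = term_index(j, STEPS)
            result = term_index(i * j, STEPS * STEPS)
            tallies.setdefault((row, col), Counter())[result] += 1
    table = {}
    for (row, col), counts in tallies.items():
        # Most frequent result wins; ties go to the lower term (assumption).
        winner = max(counts, key=lambda t: (counts[t], -t))
        table[(TERMS[row], TERMS[col])] = TERMS[winner]
    return table, tallies


table, tallies = build_multiplication_table()
print(table[("MEDIUM", "LOW")])
```

Because the tallies for a pair of input terms do not depend on their order, the resulting table is symmetric, which is the commutativity noted above.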

After  repeated  trials,  it  was  determined  that  the  linear  mapping  best  represents  the  desired 
behavior  of  the  addition  and  multiplication  functions. 

3.5.3  Fuzzy  Arithmetic.  Fuzzy  arithmetic  is  created  by  using  the  extension  principle 
outlined  by  Zadeh  and  discussed  by  Schmucker  (Schmucker84:133).  The  extension  principle 
allows for any function to be mapped to fuzzy sets. By using the functional definitions of addition,
multiplication, and division for real numbers and mapping these functions into the fuzzy set
domain, the equations shown in Equation (3) on page 33 are derived.

The biggest problem in trying to implement an algebra defining fuzzy arithmetic involves
the lack of an infinite domain space. The extension principle assumes the existence of an infinite
domain space made up of all the possible fuzzy sets. This is a valid assumption for the theoretical
generation of fuzzy functions, but leads to a problem in an actual implementation of a fuzzy
algebra.

To demonstrate this, assume there exist three primary fuzzy sets: LOW, MEDIUM, and
HIGH. Also assume a hedge or modifier has been defined: VERY. The hedge is an operation
performed on a primary fuzzy set. The problem arises in the semantic meaning of the results when
a hedge is applied repeatedly. Assume VERY is applied repeatedly giving the result of
VERY-VERY-VERY-VERY-VERY-VERY-LOW, which will be represented here as VERY^6 LOW. While
most would agree that there is a different semantic meaning to VERY LOW and VERY^6 LOW,
there is little semantic difference between VERY^6 LOW and VERY^7 LOW. What about
VERY^100 LOW and VERY^101 LOW? What is the difference between VERY^∞ LOW and
VERY^(∞-1) LOW?

For the last question, I conclude there is none based on the definition that ∞ - 1 = ∞. Where
then  is  the  line  drawn  to  represent  difference?  The  line  is  drawn  subjectively  by  the  implementer  of 
the  fuzzy  algebra,  much  as  a  programmer  decides  the  number  of  significant  digits  used  to  represent 
real  values.  Given  a  subjective  cutoff  for  significance,  this  defines  a  finite  number  of  fuzzy  sets 
that can be used to represent all the values possible in the implementation of the fuzzy algebra.

Another  problem  in  implementing  fuzzy  arithmetic  involves  the  semantic  meaning  of  the 
operations.  There  is  no  single  meaning  that  can  be  applied  to  performing  an  arithmetic  operation 
on fuzzy sets. Although there is an intuitive meaning to the term "addition", especially with real
numbers, the intuition falls short for fuzzy sets. An example best demonstrates this lack of
intuition.

Assume  the  problem  requires  adding  HIGH  and  LOW.  If  it  is  assumed  that  both  values 
are  positively  increasing  then  adding  LOW  to  HIGH  will  only  increase  HIGH,  possibly  to  VERY 
HIGH  (see  Figure  4).  Another  way  to  state  this  assumption  is  that  only  positive  values  can  be 
represented. A real world example is vulnerability analysis. If the influence of one vulnerability is
HIGH,  and  the  other  is  LOW,  then  the  total  influence  of  both  would  be  HIGH  to  VERY  HIGH. 

If  on  the  other  hand,  it  is  assumed  that  one  of  the  fuzzy  sets  represents  a  median  value, 
then  "adding"  LOW  to  HIGH  will  result  in  a  value  of  MEDIUM.  This  assumption  has  the  effect 
of causing those fuzzy sets below (linguistically represent smaller values) the median value to be
'negative' from an additive point of view and those above (linguistically represent larger values) the
median to be positive (see Figure 5). The net effect of adding two fuzzy sets under this assumption
is to generate an 'average' of the values that the two fuzzy sets represent. Using this method
requires an odd number of fuzzy sets be defined. If the influence of one vulnerability is HIGH and
the  other  is  LOW,  then  the  influence  distributed  across  both  vulnerabilities  is  MEDIUM. 

Both  assumptions  concerning  the  meaning  of  "adding"  two  fuzzy  sets  are  valid,  but 
mutually exclusive. The meaning implied by the "addition" of two fuzzy sets is subjective and
implementation  dependent. 

The  same  ambiguities  carry  over  into  multiplication.  Although  the  concept  of 
multiplication  when  dealing  with  real  numbers  is  simple,  fuzzy  set  multiplication  is  not  so  easy  to 
understand.  Multiplication  is  nothing  more  than  repetitive  addition.  The  multiplication  of  the 
integer  values  4  and  5  together  can  be  stated  as  the  addition  of  4  to  itself  5  times.  The  translation 
to fuzzy multiplication becomes confusing when HIGH is "added" to itself LOW times.

Figure 4. Positive Increasing Fuzzy Sets.


Figure 5. Median Based Fuzzy Sets.


If  the  first  assumption  concerning  addition  is  used,  then  the  multiplication  of  HIGH  and 
LOW  will  result  in  a  value  of  at  least  HIGH.  If  the  second  assumption  is  used,  then  the  result  will 
be  somewhere  near  MEDIUM.  Again  these  assumptions  seem  valid,  but  are  mutually  exclusive. 


A  third  possibility  is  that  multiplication  performs  a  normalized  weighting  function.  In 
other words, the result of multiplying two fuzzy sets together tends to 'shift' one of the sets towards
the  other.  For  instance,  HIGH  multiplied  by  LOW  would  'shift'  HIGH  towards  LOW  and  result  in 
a  value  of  MEDIUM  HIGH  while  LOW  multiplied  by  HIGH  would  'shift'  LOW  towards  HIGH 
and  result  in  a  value  of  MEDIUM  LOW.  Immediately,  the  reader  should  notice  that  depending  on 
how  much  "shift"  is  caused  by  the  operation,  the  commutative  law  may  not  hold.  If  commutativity 
is  important  to  the  implementation,  the  "shift"  could  be  symmetric.  Also,  in  this  example,  the  first 
term  of  the  multiplication  is  'shifted'  by  the  second. 

The  best  way  to  see  how  this  works  is  with  an  example.  Suppose  we  are  trying  to 
determine  the  magnitude  of  a  budget.  If  the  HIGH  cost  items  only  occur  a  LOW  number  of  times, 
then  the  contribution  of  the  HIGH  cost  items  is  MEDIUM  HIGH.  Conversely,  if  the  LOW  cost 
items  occur  a  HIGH  number  of  times,  the  contribution  of  the  LOW  cost  items  is  MEDIUM  LOW. 

Division  of  fuzzy  sets  can  take  on  the  property  of  performing  a  relative  order  of  magnitude 
calculation.  Here,  it  is  assumed  that  the  median  fuzzy  set  approximates  the  equivalent  numerical 
value  of  one  (see  Figure  6).  Therefore,  those  fuzzy  sets  below  the  median  are  treated  as  if 
numerically  they  are  between  zero  and  one,  while  those  fuzzy  sets  above  the  median  are  treated  as 
greater  than  one.  Hence,  if  a  large  magnitude  number  is  divided  by  another  large  magnitude 
number,  the  result  is  somewhere  near  one.  If  a  large  magnitude  number  is  divided  by  a  very  small 
magnitude  number,  the  result  is  an  even  larger  magnitude  number.  Conversely,  a  small  magnitude 
number  divided  by  a  large  magnitude  number  results  in  an  even  smaller  magnitude  number  than  the 
original. 

3.6 Scalability


An issue that plagues many algorithms is scalability. In the case of the statistical analysis
approach outlined on page 29, the equations scale linearly with regard to the data set size. For the
Schmucker method, as well as the Kaufmann and Gupta method, the equations do not scale well.
In both of these methods, the equations will cause exponential growth in the size of the fuzzy set
resulting  from  the  fuzzy  arithmetic  operations.  For  the  table  lookup  method,  the  equations  remain 
linear  regardless  of  the  data  set  size. 


3.7  Sample  Data  Generation 

In  order  to  test  the  methods,  sample  data  had  to  be  generated.  The  output  generated  by 
ARES  version  2.0  was  used  as  a  source  of  possible  vulnerabilities.  ARES  was  used  for 
convenience, but a random sample of vulnerabilities could be generated by hand. The
vulnerabilities output by ARES were combined into a single file. These vulnerabilities are in the
form  of  text  strings  such  as  "The  system  does  not  use  passwords." 

This  list  of  vulnerabilities  was  then  put  into  a  LISP  object  class  structure  as  shown  in 
Figures  7  and  8.  Seven  object  slots,  one  for  each  functional  area  being  considered,  are  associated 
with  each  vulnerability  object.  Each  slot  value  represents  how  the  vulnerability  is  allocated  to  each 
functional  area  and  can  have  a  value  in  the  inclusive  range  of  0  to  1  for  the  numeric  case.  These 
values  were  randomly  generated.  For  the  linguistic  case,  a  term  is  used  to  represent  the 
approximate  distribution.  The  linguistic  terms  implemented  were  VERYLOW,  LOW, 
MEDIUMLOW,  MEDIUM,  MEDIUMHIGH,  HIGH,  and  VERYHIGH.  Each  of  these  terms  is 
subject  to  the  translation  mapping  discussed  in  the  previous  sections. 
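
The step from a numeric allocation to a linguistic term can be sketched in a few lines of Lisp. This is an illustration only: it assumes seven equal-width intervals on the range 0 to 1, which is not necessarily the translation mapping defined earlier in this thesis. Reading H, MH, M, and L as HIGH, MEDIUMHIGH, MEDIUM, and LOW, the assumption does agree with the values shown in Figures 7 and 8. The names *linguistic-terms*, number->term, and random-allocation are hypothetical.

```lisp
;; Illustrative sketch only. Equal-width intervals are an assumption;
;; the thesis defines its own translation mapping in an earlier section.
(defparameter *linguistic-terms*
  '(verylow low mediumlow medium mediumhigh high veryhigh)
  "The seven linguistic terms, ordered from smallest to largest.")

(defun number->term (x)
  "Map X in the inclusive range 0 to 1 onto one of the seven terms."
  (let* ((n (length *linguistic-terms*))
         ;; FLOOR picks the interval; MIN keeps X = 1 in the last interval.
         (index (min (floor (* x n)) (1- n))))
    (nth index *linguistic-terms*)))

(defun random-allocation (&optional (areas 7))
  "Return AREAS pairs (value . term), one per functional area.
RANDOM yields values in [0, 1), so exactly 1 never occurs here."
  (loop repeat areas
        collect (let ((value (random 1.0)))
                  (cons value (number->term value)))))
```

Calling (number->term 0.1887) with the media distribution value from Figure 7 selects the second interval, which corresponds to the 'L entry in Figure 8.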

3.8  Summary 

This  chapter  discussed  the  qualitative  and  quantitative  methods  used  in  this  thesis.  The 
quantitative  method  uses  statistical  analysis  techniques  familiar  to  most  readers,  while  the 
qualitative  methods  use  fuzzy  arithmetic  to  perform  intermediate  calculations  such  as  influence 
contribution. Two methods to model fuzzy arithmetic are given: Schmucker's method and a table
lookup  method  based  on  behavior  grouping.  There  is  a  preference  for  the  table  lookup  method 
because  of  its  linear  characteristics  with  regard  to  the  number  of  vulnerabilities  being  processed. 
Finally, this chapter discussed how the vulnerability data is generated.

(setf V100 (make-instance 'vuln-node
  :vuln "The system does not have audit trails."
  :code-name 'V100
  :vuln-influence 0.7334
  :dist-audit 0.6833
  :dist-recover 0.4680
  :dist-access 0.5881
  :dist-media 0.1887
  :dist-os 0.8490
  :dist-configuration 0.4732
  :dist-documentation 0.5845
  :vuln-present t))

Figure 7. Lisp Object with Numerical Data

(setf F100 (make-instance 'vuln-node
  :vuln "The system does not have audit trails."
  :code-name 'F100
  :vuln-influence 'H
  :dist-audit 'MH
  :dist-recover 'M
  :dist-access 'MH
  :dist-media 'L
  :dist-os 'H
  :dist-configuration 'M
  :dist-documentation 'MH
  :vuln-present t))

Figure 8. Lisp Object with Fuzzy Data

IV  Implementation


4.1  Introduction 

In this chapter, I will discuss the actual implementation details for two of the analysis
methods identified in the previous chapter. Again, because of the lack of an industry standard, it
was left to my discretion as to how and what to implement. I felt that with simpler tools, the
comparison between methods would be clearer. Therefore, I implemented a quantitative method
using statistical analysis and a qualitative method using fuzzy logic.

The  code  for  this  thesis  was  implemented  on  a  Sun  SPARCstation  2+  running  SunOS 
4.0.x.  The  code  was  implemented  using  Sun  Common  Lisp  4.0  with  CLOS  extensions.  The 
primary  purpose  of  using  Lisp  was  the  ease  with  which  long  linked-list  structures  are  handled. 
Both the numerical and fuzzy logic implementations used the same CLOS object structure as
shown  in  Figure  9. 

4.2  Quantitative  Method  using  Statistical  Analysis 

The  quantitative  method  implemented  a  small  subset  of  statistical  measures.  These 
statistical measures were weighted averaging and standard deviation. These measures were deemed
adequate to demonstrate the type of information derivable from the data. Other more complex
measures  could  have  been  implemented,  but  it  was  felt  they  would  simply  complicate  the 
comparison between methods and did not appear to provide additional information.

(defclass vuln-node ()
  ((vuln
    :initform '()
    :initarg :vuln
    :accessor vuln
    :documentation "Text string with name of vuln")
   (code-name
    :initform '()
    :initarg :code-name
    :accessor code-name
    :documentation "code for internal assignment purposes")
   (vuln-influence
    :initform 0.0
    :initarg :vuln-influence
    :accessor vuln-influence
    :documentation "weight of this vulnerability to the whole")
   (dist-audit
    :initform 1.0
    :initarg :dist-audit
    :accessor dist-audit
    :documentation "Degree with which vuln affects audit")
   (dist-recover
    :initform 0.0
    :initarg :dist-recover
    :accessor dist-recover
    :documentation "Degree with which vuln affects recovery")
   (dist-access
    :initform 0.0
    :initarg :dist-access
    :accessor dist-access
    :documentation "Degree with which vuln affects access")
   (dist-media
    :initform 0.0
    :initarg :dist-media
    :accessor dist-media
    :documentation "Degree with which vuln affects media control")
   (dist-os
    :initform 0.0
    :initarg :dist-os
    :accessor dist-os
    :documentation "Degree with which vuln affects operating system")
   (dist-configuration
    :initform 0.0
    :initarg :dist-configuration
    :accessor dist-configuration
    :documentation "Degree with which vuln affects configuration")
   (dist-documentation
    :initform 0.0
    :initarg :dist-documentation
    :accessor dist-documentation
    :documentation "Degree with which vuln affects documentation")
   (vuln-present
    :initform t
    :initarg :vuln-present
    :accessor vuln-present
    :documentation "Is the vulnerability present")))

Figure 9. CLOS Object Structure

To determine the contribution of a vulnerability to a functional area, the product
of the vulnerability influence and the distribution value for that functional area was calculated.
Then, based on these contributions, the average contribution and standard deviation were calculated
for the data sets. The contribution values were then sorted and the vulnerabilities in or matching
the top 10% of the contributors were identified as critical. The possibility exists for a larger
number than 10% to be identified as critical. This would occur as a result of contribution values
tying for inclusion in the top 10%. Since there is no reason to discriminate against these values
based solely on position within the sorted contributions, they are included in the critical
vulnerabilities.

The vulnerabilities were also grouped by units of standard deviation. It should be pointed
out  that  the  10%  value  and  the  use  of  standard  deviation  are  arbitrary  thresholds.  It  was  necessary 
to  establish  some  threshold  in  order  to  determine  critical  vulnerabilities. 
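
The quantitative steps described above can be sketched as follows. This is a minimal illustration, not the code written for this thesis: it uses a plain (unweighted) mean and the population form of the standard deviation as simplifying assumptions, and the function names are hypothetical. The threshold is given as the exact rational 1/10 so that the size of the top group is not affected by floating-point rounding.

```lisp
;; Illustrative sketch only; names and statistical conventions are assumptions.
(defun contribution (influence distribution)
  "Contribution of one vulnerability to one functional area."
  (* influence distribution))

(defun mean (values)
  "Arithmetic mean of a non-empty list of numbers."
  (/ (reduce #'+ values) (length values)))

(defun standard-deviation (values)
  "Population standard deviation of a non-empty list of numbers."
  (let ((m (mean values)))
    (sqrt (/ (reduce #'+ (mapcar (lambda (v) (expt (- v m) 2)) values))
             (length values)))))

(defun critical-entries (scored &optional (fraction 1/10))
  "SCORED is a list of (name . contribution) pairs.
Return the pairs in the top FRACTION, plus any pairs that tie with the
lowest contribution inside that top group."
  (when scored
    (let* ((sorted (sort (copy-list scored) #'> :key #'cdr))
           (top-count (max 1 (ceiling (* fraction (length sorted)))))
           (cutoff (cdr (nth (1- top-count) sorted))))
      ;; Keeping everything at or above the cutoff includes the ties.
      (remove-if (lambda (entry) (< (cdr entry) cutoff)) sorted))))
```

With ten scored vulnerabilities of which two share the highest contribution, critical-entries returns both, which is the case where more than 10% of the vulnerabilities are identified as critical.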

4.3  Qualitative  Method  using  Fuzzy  Logic 

4.3.1  Fuzzy Math. To allow comparison between the methods, the same type of
information  generated  by  the  quantitative  method  was  desired.  To  calculate  the  influence 
contributions,  it  is  necessary  to  multiply  the  influence  value  for  each  vulnerability  by  the 
distribution  values  given.  This  was  done  using  a  table  lookup.  The  lookup  tables  for  the  addition 
and  multiplication  of  fuzzy  values  are  given  in  Tables  1  and  2  respectively. 

As  mentioned  in  the  previous  chapter,  calculation  of  a  weighted  average  is  impossible  with 
linguistic  terms.  In  order  to  calculate  an  average  contribution  for  each  functional  area,  the  sum  of 
the contributions was normalized by the sum of the vulnerability influences.

As in the quantitative method, the vulnerabilities in or matching the top 10% of the
contributors were identified as critical, along with those vulnerabilities in or matching the top 10%
of more than one functional area. Again, ties are included in the identification of
critical vulnerabilities. The vulnerabilities were also grouped by linguistic value in order to provide
additional insight into the structuring of vulnerabilities for each functional area. Also shown on
the output is the number of times each linguistic value occurs as a contribution value to a specific
functional area. This data is provided to show the range of values occurring within each
functional area.
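
The count of each linguistic value within a functional area can be sketched as below. The function name count-terms is hypothetical, and the input is assumed to be a simple list of the linguistic contribution values for one functional area.

```lisp
;; Illustrative sketch only; the name and input format are assumptions.
(defun count-terms (contributions)
  "Return an alist of (term . count), in order of first appearance."
  (let ((counts '()))
    (dolist (term contributions (nreverse counts))
      (let ((entry (assoc term counts)))
        (if entry
            (incf (cdr entry))                ; term seen before
            (push (cons term 1) counts))))))  ; first occurrence
```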

Although  I  implemented  a  version  of  Schmucker's  method  and  attempted  to  execute  this 
method  against  a  sample  of  50  vulnerabilities,  this  method  never  successfully  completed  execution. 
The  problem  was  not  with  the  implementation,  but  combinatorial  explosion.  After  running  for  over 
24  hours,  the  Schmucker  method  would  consistently  cause  out  of  memory  errors.  As  such,  I  was 
never  able  to  acquire  results  using  this  method  on  a  complete  sample  of  vulnerabilities. 

As  mentioned  in  the  previous  chapter,  these  lookup  tables  were  built  based  on  the  majority 
behavior  of  the  indicated  operation.  There  were  a  few  cases  where  the  majority  was  only  slightly 
larger  than  the  minority.  It  is  possible  because  of  this  to  see  slightly  unexpected  behavior  if  this 
table  lookup  method  is  compared  to  a  numeric  method.  If  the  numeric  values  fall  near  the  mapping 
boundaries,  the  linguistic  result  may  be  off  by  at  most  one  category.  In  no  case  did  the  boundary 
shift  by  more  than  one  fuzzy  term. 
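
One way to picture the majority behavior described above is the sketch below. It is not the procedure used to build Tables 1 and 2, and its results may differ from those tables. It assumes that each linguistic term covers one of seven equal-width intervals on the range 0 to 1, samples points from the intervals of two terms, multiplies them, and reports the term into which most of the products fall. The names bin-of and majority-product, and the sampling grid, are hypothetical.

```lisp
;; Illustrative sketch only. The equal-width mapping and the sampling
;; grid are assumptions, not the construction used for Tables 1 and 2.
(defparameter *terms*
  '(verylow low mediumlow medium mediumhigh high veryhigh))

(defun bin-of (x)
  "Index of the equal-width interval that contains X in [0, 1]."
  (min (floor (* x (length *terms*))) (1- (length *terms*))))

(defun majority-product (term-a term-b &optional (steps 20))
  "Term that most products of points from TERM-A and TERM-B fall into.
Ties between terms are resolved in favor of the lower term."
  (let* ((n (length *terms*))
         (a (position term-a *terms*))
         (b (position term-b *terms*))
         (votes (make-list n :initial-element 0)))
    (dotimes (i steps)
      (dotimes (j steps)
        ;; Midpoints of STEPS sub-intervals inside each term's interval,
        ;; kept as exact rationals to avoid rounding at the boundaries.
        (let ((x (/ (+ a (/ (+ i 1/2) steps)) n))
              (y (/ (+ b (/ (+ j 1/2) steps)) n)))
          (incf (nth (bin-of (* x y)) votes)))))
    (nth (position (reduce #'max votes) votes) *terms*)))
```

Under these assumptions, some pairs of terms split their votes between two adjacent result terms, which is the situation in which a table entry can differ from a numeric calculation by one category.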

4.4  Summary 

This chapter described how each of the vulnerability analysis methods was implemented.
Also discussed was how the critical vulnerabilities were identified and how the vulnerabilities were
grouped according to units of standard deviation in the quantitative case and linguistic terms in the
qualitative case. Finally, how the lookup tables were built and problems with boundary conditions
were discussed.

V  Results


5.1  Introduction 

This chapter will outline the results obtained from the methods and implementations given
in Chapters 3 and 4. Although the Schmucker method was implemented, that method will not be
used for comparison purposes due to its significant problem with scalability. The discussion of the
comparison results will only be based on the quantitative analysis method using statistical analysis
and the qualitative analysis method using the table lookup of fuzzy arithmetic functions.

Given that there is no industry standard with which to compare these results, some of the
comparisons  given  below  have  to  be  subjective  in  nature.  When  a  subjective  comparison  is  made, 
an  attempt  is  made  to  explain  the  basis  for  the  comparison  and  how  the  results  were  interpreted. 

5.2  Hypothesis  (restated) 

The hypothesis of this thesis is stated in two parts: that a qualitative analysis approach to
vulnerability analysis is as effective and efficient as a quantitative approach and that the qualitative
approach provides the security analyst with intuitive information not readily available in
quantitative approaches. Effectiveness is the ability to provide reasonable categorizations of
identified  vulnerabilities  based  on  influence  contributions  to  a  functional  area.  It  is  measured  with 
regard  to  a  method's  ability  to  categorize  vulnerabilities  into  reasonable  clusters  and  how  easy  it  is 
for  that  method  to  be  used.  Efficiency  is  measured  with  regard  to  processing  time  and  scalability 
of the method.


5.3  Effectiveness

When evaluating any method to perform a specific task, it is essential to determine the
effectiveness of that method in performing the task. For computer security vulnerability analysis, a
method is effective if it can categorize or cluster the vulnerabilities into reasonable groupings based
on importance or impact. This ability is necessary, but it is not sufficient for determining
effectiveness. The ease with which the method can be applied must also be considered. A method
may be very effective at performing categorizations, but if it requires extensive data setup or the
results of the method are difficult to interpret, most would agree that the method loses its
effectiveness.

5.3.1  Ability to Categorize Vulnerabilities. The measurement of the ability of a given
method to categorize vulnerabilities is divided into four main concerns: accuracy, precision,
generation of distinctions, and evaluation of close results. Most of these concerns are evaluated on
a subjective basis and the evaluation may depend on the specific application of the method.

5.3.1.1  Accuracy. A concern with any method is the accuracy of the results. A
true determination of accuracy requires that an accepted method be the baseline to compare the
results of other methods against. This presented an insurmountable problem as there is no industry
or academic standard method.

However, a subjective determination of accuracy was possible by looking at broad
clusterings of the vulnerabilities. In this, I sought to determine if those vulnerabilities that were on
the high end of the importance scale for one method were also on the high end for the other method.
Note that this does not imply that the actual order of vulnerabilities is the same for both methods,
only that the broad clusterings were similar. Given this subjective method, the two methods
produced comparable broad clusters (see Figures 10 and 11). Reviewing the data shown in these
figures shows that vulnerabilities V100 and V125 and the corresponding vulnerabilities F100 and
F125 were clustered in the top category. Likewise, vulnerabilities V121 and V126 and the
corresponding F121 and F126 were clustered in the lowest category. I cannot say, nor is it
possible to say without a baseline standard, that one method is more accurate than the other. I can say
that, based on the subjective broad clusterings, each method appears to have comparable accuracy.

Figure 10. Clustering of Audit Functional Area (Quantitative)

Figure 11. Clustering of Audit Functional Area (Qualitative)


5.3.1.2  Precision. Another concern in evaluating the effectiveness of a method
is precision. Precision differs from accuracy and the two should not be confused. Accuracy
implies a degree of correctness, whereas precision is the degree of resolution. For instance, 3.14
and 3.14159265 are both estimates of the value of pi. Neither is completely accurate, but the
second value is more precise. So is 1.5923111, but it is obviously less accurate. This illustrates
that high precision does not imply high accuracy.

In  evaluating  the  precision  of  the  two  methods,  most  would  agree  that  a  numeric  solution 
would  have  a  higher  degree  of  precision  over  a  non-numeric  solution.  This  can  be  seen  in  that  the 
numeric  solution  has,  at  least  theoretically,  an  infinite  degree  of  precision,  while  the  non-numeric 
solution  is  limited  to  the  resolution  provided  by  the  linguistic  terms. 

Since  the  numeric  solution  is  more  precise,  it  could  be  inferred  that  this  added  precision 
also  adds  information  to  the  results  and  as  such,  this  added  information  should  be  usable.  The 
problem with making this inference is that the original input to the problem, the subjective
assignment  of  influence  and  distribution  for  each  vulnerability,  lacks  precision. 

In the numeric method, the analyst has an infinite range of values between 0 and 1 that can
be  assigned  to  each  influence  or  distribution  value.  While  this  allows  the  analyst  to  provide  a 
higher  degree  of  resolution  of  the  input,  it  does  not  necessarily  add  information  to  the  analysis.  For 
example,  the  analyst  is  providing  influence  values  to  two  vulnerabilities,  and  while  they  both  have 
a  subjective  rating  of  medium,  the  analyst  wants  one  to  be  slightly  more  medium  than  the  other. 
She  therefore  assigns  one  an  influence  value  of  0.51  and  the  other  an  influence  value  of  0.55.  Also 
assume  that  for  a  given  functional  area,  both  vulnerabilities  have  a  distribution  value  of  1.0.  When 
all  of  the  calculations  are  performed  and  the  vulnerabilities  have  been  clustered,  it  is  possible  that 
the  two  vulnerabilities  will  fall  into  the  same  cluster. 

In  reality,  providing  this  level  of  precision  on  the  input  has  not  affected  the  overall 
assignment  of  the  vulnerabilities  to  a  specific  cluster.  Likewise,  it  is  possible  that  increasing  the 
resolution of the output values will not change the clustering. What increasing the precision in the
data input and the results output would do is provide a false sense of improved accuracy. The case
where  the  two  vulnerabilities  do  fall  into  different  clusters  will  be  discussed  below  concerning 
evaluation  of  close  results. 

The  level  of  precision  provided  by  a  method  is  directly  affected  by  the  true  precision  of  the 
data  input  into  that  method.  If  the  analysis  starts  with  inherently  imprecise  data,  it  is  not 
reasonable  that  the  true  accuracy  of  the  results  will  increase  just  because  the  data  is  expressed  in 
terms  of  increased  precision.  While  it  is  true  that  the  numeric  method  has  a  greater  possible 
precision  due  to  increased  resolution  of  the  input  values,  I  contend  that  this  increased  precision 
simply  leads  to  a  false  assumption  that  the  results  are  more  accurate.  Simply  put,  subjective  inputs 
lead  to  subjective  outputs  and  precise  subjective  inputs  lead  to  precise  subjective  outputs.  The  key 
point is that the outputs, regardless of how precise the inputs, are still subjective.

5.3.1.3  Generation of Distinctions. In performing the categorization of
vulnerabilities, it is necessary to generate distinctions among data values. By generating these
distinctions, the method is able to group the information into clusters. The concern here is how
effective the two methods are at generating these distinctions.

In  the  non-numeric  case,  the  most  obvious  distinction  is  by  linguistic  value.  Grouping  the 
analysis  results  by  linguistic  values  provides  an  intuitive  clustering  of  the  information.  For 
instance, it makes sense to group all of the vulnerabilities with a VERYHIGH influence contribution
together,  then  the  vulnerabilities  with  a  HIGH  influence  contribution,  and  so  forth.  Even  within 
these  clusters,  further  distinctions  can  be  made  based  on  linguistic  values  by  grouping  the 
vulnerability  influence  values  that  are  the  same  or  by  grouping  the  vulnerability  distribution  values 
for  a  specific  functional  area.  The  distinctions  are  appropriate  given  the  desired  goal  of  trying  to 
cluster  all  of  the  vulnerabilities  by  importance. 




In the numeric case, there is no obvious distinction. Any distinction made is arbitrary
since the range of possible values is infinite. Some possible distinctions that could be made would
include grouping the data by units of standard deviation (the choice used in this thesis), developing
confidence intervals, or calculating histogram clusterings. There is a plethora of statistical
methods  that  could  be  used  to  cluster  the  vulnerability  data,  but  none  is  sufficient  for  all 
circumstances.  This  is  one  of  the  primary  reasons  that  no  industry  standard  method  has  been 
developed. 

Again,  it  must  be  stressed  that  the  original  input  values  to  either  method  are  subjective  and 
as  such  the  output  of  either  method  is  subjective.  Any  distinctions  made  between  adjacent  data 
points  in  the  numeric  method  are  arbitrary. 

For instance, assume a university chooses its distinguished graduates from the students
within the top ten percent of the graduating class's grade point average (GPA). The cutoff, as
determined by numeric methods, is 3.9781. All students whose GPA is 3.9781 or greater are
classified as distinguished graduates. Assume another student has a GPA of 3.9780. Using a
strictly numeric cutoff eliminates this student from being a distinguished graduate. Also note that
if the precision of the cutoff were reduced to 3.978, and all GPAs rounded to three decimal places, the
student would be selected. Of course, this selection of three decimal places is just as arbitrary as
four, and the same boundary condition occurs for someone with a GPA of 3.9774.

The  purpose  of  this  example  is  to  show  how  placing  an  arbitrary  and  fixed  cutoff  to  the 
data  being  used  can  lead  to  undesired  results.  In  performing  the  distinguished  graduate  selection, 
the university's real goal was to reward those students whose academic achievement places them in
or  very  near  the  top  ten  percent  of  their  class. 
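
The effect of the chosen precision on the distinguished graduate example can be checked with a short sketch. The function name distinguished-p is hypothetical, and GPAs are represented as integers scaled by 10000 (so 39781 stands for 3.9781), an assumption made here only to keep the rounding exact.

```lisp
;; Illustrative sketch only. GPAs are integers scaled by 10000 so that
;; rounding is exact; this representation is an assumption.
(defun distinguished-p (gpa cutoff &optional (decimals 4))
  "True if GPA meets CUTOFF once both are rounded to DECIMALS places.
DECIMALS is expected to be between 0 and 4."
  (let ((divisor (expt 10 (- 4 decimals))))
    (>= (round gpa divisor) (round cutoff divisor))))
```

At four decimal places a GPA of 3.9780 fails the 3.9781 cutoff; at three decimal places it passes, while a GPA of 3.9774 then becomes the excluded boundary case.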

5.3.1.4  Evaluation of Close Results. The examples given in the previous
section bring to light the problem of how to evaluate close results. Close results are those results
that fall on or near the distinction cutoffs being used. In the case of the non-numeric method, it is
adequate  to  say  that  a  result  is  close  to  another  if  they  have  the  same  linguistic  values.  This  is 
adequate  because  the  linguistic  values  represent  ranges  of  values  and  these  ranges  are  fairly  large. 
This  has  the  effect  of  enforcing  the  desired  behavior  that  if  two  vulnerabilities  are  close  in 
importance,  then  they  should  be  considered  together  and  it  is  not  reasonable  to  distinguish  between 
them. 

In  the  numeric  case,  as  was  demonstrated  by  the  above  examples,  two  values  can  be 
numerically close and still fall into different clusters. This is true even if a histogram-based
program is used to try to determine the 'natural' clusters. This is caused by the exactness of the
numeric method where each value used is in effect its own cluster. Because of the infinite number
of  individual  clusters,  the  numeric  method  will  apply  the  chosen  arbitrary  cutoff  between  two 
vulnerabilities.  In  reality,  these  two  vulnerabilities  should  be  considered  together,  but  the  numeric 
method  may  be  unable  to  detect  this  closeness  and  arbitrarily  applies  the  cutoff. 

5.3.2  Ease of Use. Ease of use can only be measured subjectively. What is easy for one
person to use may be difficult for another person. However, even with this in mind, there are
certain  attributes  of  each  method  that  are  demonstrative  of  their  ease  of  use.  The  most  important 
measure  of  ease  of  use  is  the  ease  of  interpreting  the  data. 

5.3.2.1  Interpretation  of  Data.  As  with  any  product,  the  main  goal  is  to  make 
meaningful  and  consistent  interpretations  of  the  output  results.  A  consistent  interpretation  of  the 
results  implies  that  a  single  user  construes  the  same  information  from  the  results  every  time  the 
results are reviewed. A consistent interpretation also implies that there is no ambiguity in what the
results mean. A meaningful interpretation is much harder to define, but can be seen as whether the
analyst must guess what the output values mean. Another definition of
meaningful interpretation could be whether the data has an implied semantic meaning.

The first interpretation of data is made with regard to the input data. It is essential that the
input  data  be  generated  and  interpreted  consistently.  Assume  for  the  numeric  case,  the  user  is 
asked  to  assign  an  influence  value  to  a  vulnerability  with  the  possible  choices  being  between  0  and 
100,  and  the  user  selects  56.  Would  there  have  been  any  significance  to  choosing  56  over  57?  Not 
in  reality,  unless  the  equations  used  to  perform  the  calculations  are  extremely  sensitive  to  small 
changes  in  input.  As  such  a  sensitive  system  would  probably  not  be  fielded  in  the  first  place,  I 
assert  there  is  no  significant  difference  in  choosing  56  over  57  in  the  preceding  question. 

Some  might  argue  that  the  user  has  too  many  choices,  so  the  choices  are  lowered  from  0  to 
100  to  the  range  0  to  9.  Here  we  would  expect  that  there  is  a  significant  difference  between  5  and 
6.  Lowering  the  number  of  choices  improved  the  resolution  of  choice  significance.  What  then  is 
the  significance  of  the  user  assigning  5  (from  the  choice  of  0  to  9)  to  an  influence  value?  This 
could  now  be  interpreted  to  mean  that  the  vulnerability  has  a  medium  influence  value.  I  contend 
that the user should input the value MEDIUM, which conveys the semantic meaning intended by the
user. 

The  same  semantic  difficulties  arise  in  interpreting  system  output.  After  processing  all  of 
the identified vulnerabilities using whatever numeric methods are chosen, the system generates an
overall  vulnerability  rating  of  8  from  the  range  of  0  to  9.  This  could  be  interpreted  to  mean  a  high 
vulnerability  rating  if  the  value  of  9  represents  the  high  end  of  the  scale.  It  could  also  be 
interpreted  to  mean  a  low  rating  if  9  represents  the  low  end  of  the  rating  scale.  If  on  the  other 
hand,  the  system  indicates  that  the  overall  vulnerability  rating  is  HIGH,  there  is  little  left  to 
interpret. 

In  the  numeric  case,  the  user  is  presented  with  a  list  of  statistical  values  that  attempt  to 
represent  how  the  vulnerabilities  are  important  to  each  functional  area.  In  most  cases,  the  analyst 
makes an internal conversion from the numeric values to their intended linguistic meaning. Internal
conversion implies that the values are converted in the analyst's head as opposed to by an algorithm
implemented on a computer. The analyst would look at the data and build internal subjective
scales to evaluate the meaning of the numeric values. The inconsistency arises when a few days,
months, or years later, the same analyst reviews the output results. It is up to the analyst to try to
remember  the  exact  interpretation  made  when  the  results  were  previously  reviewed. 

There  can  also  be  inconsistencies  when  multiple  analysts  are  reviewing  the  numerical 
information.  It  is  very  unlikely  that  any  two  analysts  would  generate  the  same  internal  scales  to 
evaluate  the  meaning  of  the  numeric  values. 

In  the  non-numeric  case,  there  is  a  fairly  standard  meaning  associated  with  each  of  the 
linguistic  terms.  These  meanings  are  fairly  standard  in  that  most  security  analysts  would  have 
similar  internal  representations  of  the  concepts  HIGH,  MEDIUM,  and  LOW.  There  may  be  some 
latitude  in  the  specific  meanings  associated  with  compound  linguistic  terms  such  as  VERYLOW 
and  MEDIUMHIGH,  but  the  general  meanings  associated  with  these  types  of  terms  should  be 
consistent  within  a  particular  domain. 

Looking  at  the  non-numeric  data,  the  analyst  is  not  left  to  speculate  as  to  whether  a 
vulnerability  has  a  HIGH  or  VERYHIGH  rating;  the  data  simply  states  the  value.  The  analyst  is 
also  not  required  to  remember  from  day  to  day,  any  numeric  range  of  values  that  constitute  a 
particular  linguistic  term. 

It is in this category of ease of interpretation of data that the non-numeric method most clearly
excels over the numeric method.

5.4  Efficiency


5.4.1  Timed  Performance.  The  timed  performances  for  each  of  the  methods  were 
comparable.  The  actual  times,  in  seconds,  are  shown  in  Table  3  and  plotted  in  Figure  12.  The 
performance was almost identical for the 50 vulnerability sample, while the numerical method was
better for the 120 data sample. The 198, 250, 320, and 400 sample data sets are combinations of
the  smaller  sets  generated  only  for  the  purpose  of  timing.  The  data  shows  that  the  two  methods 
perform  comparably  and  therefore,  for  vulnerability  analysis,  neither  method  is  a  clear  winner. 

5.4.2  Scalability.  As  mentioned  in  previous  chapters,  both  methods  should  demonstrate 
linear  scalability.  This  is  demonstrated  by  Figure  12.  It  should  be  noted  that  a  portion  of  the  time 
shown in Table 3 can be attributed to programming overhead that is not related to data set size.
Given  the  linearity  of  each  method,  either  would  be  appropriate  for  vulnerability  analysis. 
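
The growth in time per vulnerability can be estimated from the average times in Table 3 with a least-squares slope, as sketched below. The sketch assumes that the six values in each row of Table 3 correspond, in order, to the 50, 120, 198, 250, 320, and 400 sample data sets named in Section 5.4.1. The function name least-squares-slope is hypothetical.

```lisp
;; Illustrative sketch only. The pairing of Table 3 values with sample
;; sizes is an assumption based on the order given in Section 5.4.1.
(defparameter *sizes* '(50 120 198 250 320 400))
(defparameter *fuzzy-average*   '(0.72 1.63 3.08 3.64 4.87 6.45))
(defparameter *numeric-average* '(0.72 1.16 3.40 3.76 4.39 6.90))

(defun least-squares-slope (xs ys)
  "Slope of the least-squares line through the points (x, y)."
  (let* ((n (length xs))
         (mean-x (/ (reduce #'+ xs) n))
         (mean-y (/ (reduce #'+ ys) n))
         (sxy (reduce #'+ (mapcar (lambda (x y)
                                    (* (- x mean-x) (- y mean-y)))
                                  xs ys)))
         (sxx (reduce #'+ (mapcar (lambda (x) (expt (- x mean-x) 2)) xs))))
    (/ sxy sxx)))
```

Under that assumption, the two slopes come out at roughly 0.016 seconds per additional vulnerability for the fuzzy method and roughly 0.017 for the numeric method, which is consistent with the comparable timing reported above.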

5.5  Summary 

The above results demonstrate that both methods perform equally well with regard to
timed performance and scalability. However, there are significant differences in the ability of the
two methods to categorize the vulnerability information and the ease with which the data can be
interpreted.

For  the  average  user,  the  fuzzy  analysis  method  has  the  advantage  here.  The  thought 
process  associated  with  assigning  influence  values  and  how  these  influence  values  are  distributed  is 
much more natural with the fuzzy logic approach. Also, the analysis of the end product is more
intuitive when using linguistic terms as opposed to numerical values. This advantage is borne out
by a comment included in Schmucker:

A higher degree of response consistency over trials was found to occur if the
subject is allowed to give an imprecise verbal response about a fuzzy (concept)
than if he is forced to give a precise "grade-of-membership" answer [Kochen,
1975]. (Schmucker84:35)


Sample size        50     120    198    250    320    400
Fuzzy Best         0.69   1.60   2.96   3.62   4.84   6.40
Fuzzy Worst        0.76   1.66   3.20   3.68   4.89   6.50
Fuzzy Average      0.72   1.63   3.08   3.64   4.87   6.45
Numeric Best       0.68   1.12   3.12   3.64   4.32   6.78
Numeric Worst      0.77   1.19   3.59   3.95   4.47   7.06
Numeric Average    0.72   1.16   3.40   3.76   4.39   6.90

Table 3. Timed performance in seconds (Numerical vs. Fuzzy)


Figure  12.  Timed  performance  (Numerical  vs.  Fuzzy) 
VI  Conclusions and Recommendations


6.1  Conclusions 

This thesis has demonstrated the feasibility of using a non-traditional method to process
and analyze computer security vulnerabilities. It has demonstrated that this non-traditional
method is not only comparable to a similarly structured quantitative method but, with regard to
ease of use, better. Although fuzzy logic was the non-traditional method used in this thesis, other
methods, such as those listed in chapter 2, could possibly be applied.

The use of linguistic terms allows the end user to make meaningful evaluations of
vulnerability influence and impact. Unlike a number, a linguistic term has a semantic meaning as
well as representing a quantity.

6.2  Project  Recommendations 

This  thesis  recommends  the  use  of  a  qualitative  analysis  method  to  perform  vulnerability 
analysis.  This  recommendation  is  based  on  the  ease  of  use  and  intuitive  nature  of  qualitative 
analysis  methods  such  as  fuzzy  logic. 

6.3  Future  Enhancements  and  Phases 

A future enhancement and continuation of this work would be to create an inference
mechanism to combine the results of each functional area. This inference mechanism could be in
the form of rules that would direct how to interpret the results. The final output of such an
inference mechanism could be a single vulnerability rating or score.
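
One very simple form such a rule could take is sketched below: the overall rating is the highest rating found in any functional area. This rule, the area ratings, and the names *rating-scale*, term-rank, higher-term, and overall-rating are all assumptions made for illustration; the thesis does not define the inference mechanism.

```lisp
;; Linguistic terms ordered from Very Low to Very High.
(defparameter *rating-scale* '(vl l ml m mh h vh))

(defun term-rank (term)
  "Position of TERM on the scale; 0 is Very Low, 6 is Very High."
  (position term *rating-scale*))

(defun higher-term (a b)
  "Return whichever of A and B ranks higher on the scale."
  (if (>= (term-rank a) (term-rank b)) a b))

(defun overall-rating (area-ratings)
  "AREA-RATINGS is an alist of (area . term).
Hypothetical rule: the system is rated as high as its worst area."
  (reduce #'higher-term (mapcar #'cdr area-ratings)))

;; Made-up ratings for three functional areas.
(format t "Overall: ~A~%"
        (overall-rating '((audit . l) (access . h) (media . vl))))
```

A real rule base would likely weigh areas differently; the point is only that the output is itself a linguistic term.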

As the first phase of a multi-phase research project, this thesis lays the groundwork for
building a fully automated computer security risk analysis system. In addition to investigating and
implementing the recommendations listed in the following section, future phases of the research
project include addressing implementation tradeoffs between risk analysis capabilities and host
architecture constraints (e.g., memory limitations and microprocessor performance), developing
adaptive analysis techniques to allow for and take advantage of new technology, and generating
new methods for reasoning with uncertainty.

This thesis has addressed only one part of automating computer security:
whether a qualitative or quantitative analysis method is more applicable to
vulnerability analysis. As the use of automated methods to evaluate computer security is still in its
infancy, the possibilities for further research are almost limitless. The following paragraphs
suggest a few of the areas where additional work could, and probably should, be done.

The  vulnerability  assessment  methods  explored  in  this  thesis  could  be  expanded  to  include 
combinations  of  multiple  vulnerabilities.  This  thesis  dealt  with  evaluating  individual  vulnerabilities 
that  were  independent  of  each  other.  Future  research  should  be  conducted  which  will  allow  for 
combinations  of  vulnerabilities  to  be  assessed.  An  example  would  be  a  system  that  lacks 
passwords  and  the  facility  where  the  system  is  located  is  not  locked.  Each  of  these  alone  has  a 
vulnerability  rating  associated  with  it,  but  the  combined  effect  of  both  vulnerabilities  may  have  a 
much  higher  rating  than  either  alone. 
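
The password and locked-facility example can be made concrete with a small sketch. The combination rule used here (rate the pair one level above the higher individual rating, capped at Very High) and the individual ratings are assumptions chosen only to illustrate the idea; they are not results from this thesis.

```lisp
;; Linguistic terms ordered from Very Low to Very High.
(defparameter *levels* '(vl l ml m mh h vh))

(defun combine-ratings (a b)
  "Hypothetical rule: when two vulnerabilities are both present, rate the
pair one level above the higher individual rating, capped at VH."
  (let ((top (1- (length *levels*)))
        (higher (max (position a *levels*) (position b *levels*))))
    (nth (min (1+ higher) top) *levels*)))

;; Assumed ratings: missing passwords = H, unlocked facility = M.
(format t "Combined: ~A~%" (combine-ratings 'h 'm))
```

Under this rule the combined rating is never lower than either individual rating, which matches the intuition described above.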

Another area that should be explored is the generation of other non-numeric analysis
methods that provide linear, or very near linear, performance. Along with this research would be
analysis into the number of linguistic terms necessary to adequately represent the range of possible
vulnerability levels and analysis into how the size and structure of the lookup table affects
predictability of results.
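
To experiment with different numbers of linguistic terms, a lookup table can be generated rather than typed in by hand. The sketch below builds a saturating "addition" table for any ordered list of terms. The function names are illustrative. For the seven terms used in this thesis, the rule shown reproduces the rows of the addf-array listed in Appendix A, but whether the original table was produced this way is not stated.

```lisp
(defun make-add-table (terms)
  "Build a saturating 'addition' table: entry (i, j) is the term at
index i+j, capped at the last (highest) term."
  (let ((top (1- (length terms))))
    (loop for i from 0 to top
          collect (loop for j from 0 to top
                        collect (nth (min (+ i j) top) terms)))))

(defun table-lookup (table terms a b)
  "Look up the entry for terms A (row) and B (column)."
  (nth (position b terms) (nth (position a terms) table)))

;; A three-term scale gives a 3 x 3 table; a seven-term scale gives 7 x 7.
(let* ((terms '(low medium high))
       (table (make-add-table terms)))
  (format t "~A~%" table)
  (format t "~A~%" (table-lookup table terms 'low 'medium)))
```

The table grows with the square of the number of terms, so the choice of scale directly sets the size of every lookup table.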

Effort should be expended towards the development of an expert system that is capable of
making the necessary recommendations for vulnerability correction and threat elimination.
Currently, the USAF depends on experts at AFCSC and the CSOs to determine the necessary steps
to correct vulnerabilities and eliminate threats. Codifying the expertise of these individuals into a
portable expert system would allow new and less experienced CSOs to provide enhanced and
reliable protection of their resources.

Work could be done on using hardware and software design specifications to determine
security flaws. This would allow security flaws to be corrected prior to the system being
implemented. Most hardware and software systems are built with performance and ease of use as
driving factors. Security is often not considered until it is too late to correct a flaw. Using
information from the specification and design phases, an assessment could be performed to identify
possible security weaknesses prior to the system being built. This earlier assessment would allow
the system to be built with most, if not all, of the security flaws corrected.

Future work could be put forth on developing a system capable of performing threat
assessment to determine if known threats could penetrate the current safeguards. This research
would involve building a simulation model of each system's safeguards and then applying the
known threats to the model. As a continuation of the threat assessment concept, automatic threat
scenario generation could be developed that would generate a possible sequence of events leading
to the compromise of a computer system. The only drawback to this type of research would be its
sensitivity: any system that generates and evaluates threat scenarios would have
to be highly classified.
Appendix A:  Computer Programs


A-1  Quantitative  Method  using  Statistical  Analysis 


The following code excerpts are provided to illustrate high level functionality of the
implemented code.


Functions  used  to  generate  statistics 


(defun sum-em1 (slot-name)
  "Calculate influence values for a particular functional area"
  (setf sum1 0)
  (mapc #'(lambda (x)
            (progn
              (cond
                ((vuln-present (eval x))
                 ;; weight the slot value by the vulnerability's influence
                 (setf value (get-value (eval x) slot-name))
                 (setf value2 (vuln-influence (eval x)))
                 (setf sum1 (+ sum1 (* value value2))))
                (t (setf sum1 (+ 0 sum1))))))
        vuln-list)
  sum1)

(defun sum-em2 (slot-name)
  "Calculate the squared influence values for a particular functional area"
  (setf sum2 0)
  (mapc #'(lambda (x)
            (progn
              (cond
                ((vuln-present (eval x))
                 (setf value (get-value (eval x) slot-name))
                 (setf value2 (vuln-influence (eval x)))
                 (setf sum2 (+ sum2 (expt (* value value2) 2))))
                ;; the source listing reads sum1 here, which would overwrite
                ;; the running total; sum2 is assumed to be intended
                (t (setf sum2 (+ 0 sum2))))))
        vuln-list)
  sum2)

(defun sum-squared ()
  "Initialize variable for sum of each slot"
  (setf ssum (mapcar #'sum-em2 dist-slots)))

(defun rawsum1 ()
  "Initialize variable for sum of each slot"
  (setf inf-sum (sum-em 'vuln-influence))
  (setf areas (mapcar #'sum-em1 dist-slots))
  (setf sum-areas (apply #'+ areas))
  (setf areas-pct (mapcar #'(lambda (x) (/ x sum-areas)) areas))
  (setf sum-areas-pct (apply '+ areas-pct))
  areas)

(defun calc-inf ()
  "Calculate the influence for each functional area"
  (setf inf-calculated t)
  (preponderance)  ; generate list of # vuln present in each area
  (rawsum1)        ; generate raw weighted sums
  (setf area-avgs (mapcar '/ areas prepond))
  (sum-squared)
  ;; n = number present, x = sum of products, x2 = sum of squared products
  (setf variance (mapcar #'(lambda (n x x2)
                             (progn
                               (* (/ 1 (1- n))
                                  (- x2 (* (/ 1 n)
                                           (expt x 2))))))
                         prepond areas ssum))
  (setf stdev (mapcar #'sqrt variance))
  )
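
Read as mathematics, the variance expression in calc-inf appears to be the usual computational form of the sample variance. The symbols below are introduced here for illustration: n is taken to be the number of vulnerabilities present in an area (the prepond entry), and x_i the product of the slot value and the influence for the i-th such vulnerability, so that the areas entry is the sum of the x_i and the ssum entry is the sum of their squares.

```latex
s^{2} = \frac{1}{n-1}\left(\sum_{i=1}^{n} x_i^{2}
        - \frac{1}{n}\Bigl(\sum_{i=1}^{n} x_i\Bigr)^{2}\right),
\qquad
s = \sqrt{s^{2}}
```

The standard deviation s is what the clustering functions use as the width of each sigma group.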


Functions  used  to  generate  clusterings 


(defun count-sd-groups (slot-name)
  ;; number of standard-deviation-wide groups needed to cover 0..maxval
  (cond
    ((eval inf-calculated) nil)
    (t (calc-inf) (gen-sorted-inf-list)))
  (setq posv (position slot-name dist-slots))
  (setq maxvuln (car (nth posv sorted-inf-list)))
  (setq maxval (* (get-value (eval maxvuln) slot-name)
                  (vuln-influence (eval maxvuln))))
  (setq sd (nth posv stdev))
  (multiple-value-setq (sdgroups junk) (ceiling (/ maxval sd)))
  sdgroups)

(defun group-sd-vals (slot-name)
  ;; list of group boundaries, stepping down from maxval by one sd
  (setq groups (count-sd-groups slot-name))
  (setq posv (position slot-name dist-slots))
  (setq maxvuln (car (nth posv sorted-inf-list)))
  (setq maxval (* (get-value (eval maxvuln) slot-name)
                  (vuln-influence (eval maxvuln))))
  (setq sd (nth posv stdev))
  (setf sd-groups (list maxval))
  (loop for num from 2 to groups
        do
        (progn
          (setf minval (- maxval sd))
          (cond
            ((< minval 0) (setf minval 0)) (t nil))
          (setf sd-groups (append sd-groups (list minval)))
          (setf maxval minval)))
  sd-groups)

(defun count-sd-groups-content (slot-name)
  ;; count how many vulnerabilities fall into each sigma group
  (setq numgrps (count-sd-groups slot-name))
  (setq grprngs (group-sd-vals slot-name))
  (setq array (make-list numgrps :initial-element 0))
  (setq posf (position slot-name dist-slots))
  (setq temp-sorted-list (copy-list (nth posf sorted-inf-list)))
  (loop for pos1 from 0 to (1- (length temp-sorted-list))
        do
        (let*
            ((temp-vuln (nth pos1 temp-sorted-list))
             (value1 (get-value (eval temp-vuln) slot-name))
             (value2 (vuln-influence (eval temp-vuln)))
             (value3 (* value1 value2))
             (posx (position value3 grprngs :test #'< :from-end t))
             )
          (cond
            ((null posx) nil)
            (t (setf (nth posx array) (1+ (nth posx array)))))
          ))
  array)

(defun group-sd-content (slot-name)
  ;; split the sorted list into sublists, one per sigma group
  (setq groups (count-sd-groups slot-name))
  (setq array1 (make-list groups :initial-element nil))
  (setq posf (position slot-name dist-slots))
  (setq tsl (copy-list (nth posf sorted-inf-list)))
  (setq total-pos 0)
  (setq llist (count-sd-groups-content slot-name))
  (setq lfuz (1- groups))
  (loop for pos1 from 0 to lfuz
        do
        (let*
            ((inc (nth pos1 llist))
             (start total-pos)
             (stop (+ total-pos inc))
             )
          (setf (nth pos1 array1) (subseq tsl start stop))
          (setq total-pos (+ total-pos inc))
          ))
  array1)

(defun top-portion (slot-name count)
  ;; first COUNT entries of the sorted list, extended to include ties
  (setq cnt count)
  (setq loop-stop 0)
  (cond
    ((eval list-inf-generated) nil)
    (t (gen-sorted-inf-list)))
  (let*
      ((posf (position slot-name dist-slots))
       (ftsl (copy-list (nth posf sorted-inf-list)))
       (last (nth (1- count) ftsl))
       (infx (get-value (eval last) 'vuln-influence))
       (valx (get-value (eval last) slot-name))
       (prodx (* infx valx)))
    (loop while (eq loop-stop 0)
          do
          (let*
              ((next (nth cnt ftsl))
               (infy (get-value (eval next) 'vuln-influence))
               (valy (get-value (eval next) slot-name))
               (prody (* infy valy)))
            (cond
              ((eql prody prodx) (setq cnt (1+ cnt)))
              (t (setq loop-stop 1)))))
    (subseq ftsl 0 cnt)))

(defun top-contrib ()
  ;; tc[k] ends up holding the candidates that appear in more than k areas
  (preponderance)
  (setq pcnt (floor (* (car prepond) 0.10)))
  (setq all-c (mapcar #'(lambda (x) (top-portion x pcnt)) dist-slots))
  (setq candidates (sort
                    (remove-duplicates (flatten all-c))
                    #'string-lessp))
  (setq withdups (flatten all-c))
  (setq candcount (mapcar #'(lambda (x)
                              (count x withdups)) candidates))
  (setq lpos (1- (length candidates)))
  (setq tc (make-list 7 :initial-element candidates))
  (loop for pos from 0 to lpos
        do
        (setq tmpval (nth pos candidates))
        (case (nth pos candcount)
          ((7) t)
          ((6) (setf
                (subseq tc 6 7)
                (mapcar
                 #'(lambda (x) (remove tmpval x)) (subseq tc 6 7))))
          ((5) (setf
                (subseq tc 5 7)
                (mapcar
                 #'(lambda (x) (remove tmpval x)) (subseq tc 5 7))))
          ((4) (setf
                (subseq tc 4 7)
                (mapcar
                 #'(lambda (x) (remove tmpval x)) (subseq tc 4 7))))
          ((3) (setf
                (subseq tc 3 7)
                (mapcar
                 #'(lambda (x) (remove tmpval x)) (subseq tc 3 7))))
          ((2) (setf
                (subseq tc 2 7)
                (mapcar
                 #'(lambda (x) (remove tmpval x)) (subseq tc 2 7))))
          ((1) (setf
                (subseq tc 1 7)
                (mapcar
                 #'(lambda (x) (remove tmpval x)) (subseq tc 1 7))))
          ))
  t)
A-2  Qualitative Method using Fuzzy Analysis (Table Lookup)


The following code excerpts are provided to illustrate high level functionality of the
implemented code.


Functions  used  to  generate  statistics 


(defun sum-em1 (slot-name)
  "Calculate influence values for a particular functional area"
  (setf inf-sum (sum-em 'vuln-influence))
  (setf sum1 'vl)
  (mapc #'(lambda (x)
            (progn
              (cond
                ((vuln-present (eval x))
                 (setf value (get-value (eval x) slot-name))
                 (setf value2 (vuln-influence (eval x)))
                 ;; normalize the influence before weighting the slot value
                 (setf value3 (divf value2 inf-sum))
                 (setf sum1 (addf sum1 (multf value value3))))
                (t (setf sum1 (addf 'vl sum1))))))
        vuln-list)
  sum1)

(defun sum-em2 (slot-name)
  "Calculate the squared influence values for a particular functional area"
  (setf sum2 'vl)
  (mapc #'(lambda (x)
            (progn
              (cond
                ((vuln-present (eval x))
                 (setf value (get-value (eval x) slot-name))
                 (setf value2 (vuln-influence (eval x)))
                 (setf sum2 (addf sum2 (multf (multf value value2)
                                              (multf value value2)))))
                ;; the source listing reads sum1 here, which would overwrite
                ;; the running total; sum2 is assumed to be intended
                (t (setf sum2 (addf 'vl sum2))))))
        vuln-list)
  sum2)

(defun sort-em1 (slot-name)
  "Returns a list of nodes sorted by influence for a given slot"
  (setf current-sort-slot slot-name)
  (setf sort-list (copy-list vuln-list))
  (sort sort-list #'sort-order-inf-value))

(defun sort-order-inf-value (x y)
  ;; order by fuzzy product, then influence, then slot value, then name
  (let*
      ((valx (get-value (eval x) current-sort-slot))
       (valy (get-value (eval y) current-sort-slot))
       (infx (vuln-influence (eval x)))
       (infy (vuln-influence (eval y)))
       (pvx (position valx fuzzy-values))
       (pvy (position valy fuzzy-values))
       (pix (position infx fuzzy-values))
       (piy (position infy fuzzy-values))
       (posx (position (multf infx valx) fuzzy-values))
       (posy (position (multf infy valy) fuzzy-values)))
    (cond
      ((> posx posy) t)
      ((and (= posx posy) (> pix piy)) t)
      ((and (= posx posy) (= pix piy) (> pvx pvy)) t)
      ((and (= posx posy) (= pix piy) (= pvx pvy) (string-lessp x y)) t)
      (t nil))))

(defun rawsum1 ()
  "Initialize variable for sum of each slot"
  (setf inf-sum (sum-em 'vuln-influence))
  (setf areas (mapcar #'sum-em1 dist-slots))
  areas)


Functions  used  to  generate  clusterings 


(defun count-fuzzy-content (slot-name)
  ;; count how many vulnerabilities fall under each linguistic term,
  ;; ordered from VH (position 0) down to VL
  (setq farray (list 0 0 0 0 0 0 0))
  (cond
    ((eval list-inf-generated) nil)
    (t (gen-sorted-inf-list)))
  (setq posf (position slot-name dist-slots))
  (setq f-temp-sorted-list (copy-list (nth posf sorted-inf-list)))
  (setq lfuz (1- (length fuzzy-values)))
  (loop for pos1 from 0 to (1- (length f-temp-sorted-list))
        do
        (let*
            ((temp-fvuln (nth pos1 f-temp-sorted-list))
             (fvalue1 (get-value (eval temp-fvuln) slot-name))
             (fvalue2 (vuln-influence (eval temp-fvuln)))
             (f-temp-inf (multf fvalue1 fvalue2))
             (posx (- lfuz (position f-temp-inf fuzzy-values)))
             )
          (setf (nth posx farray) (1+ (nth posx farray)))
          ))
  farray)

(defun group-fuzzy-content (slot-name)
  ;; split the sorted list into sublists, one per linguistic term
  (setq farray1 (list nil nil nil nil nil nil nil))
  (cond
    ((eval list-inf-generated) nil)
    (t (gen-sorted-inf-list)))
  (setq posf (position slot-name dist-slots))
  (setq f-tsl (copy-list (nth posf sorted-inf-list)))
  (setq total-pos 0)
  (setq llist (count-fuzzy-content slot-name))
  (setq lfuz (1- (length fuzzy-values)))
  (loop for pos1 from 0 to lfuz
        do
        (let*
            ((inc (nth pos1 llist))
             (start total-pos)
             (stop (+ total-pos inc))
             )
          (setf (nth pos1 farray1) (subseq f-tsl start stop))
          (setq total-pos (+ total-pos inc))
          ))
  farray1)

(defun top-portion (slot-name count)
  ;; first COUNT entries of the sorted list, extended to include ties
  (setq cnt count)
  (setq loop-stop 0)
  (cond
    ((eval list-inf-generated) nil)
    (t (gen-sorted-inf-list)))
  (let*
      ((posf (position slot-name dist-slots))
       (ftsl (copy-list (nth posf sorted-inf-list)))
       (last (nth (1- count) ftsl))
       (infx (get-value (eval last) 'vuln-influence))
       (valx (get-value (eval last) slot-name))
       (prodx (multf infx valx)))
    (loop while (eq loop-stop 0)
          do
          (let*
              ((next (nth cnt ftsl))
               (infy (get-value (eval next) 'vuln-influence))
               (valy (get-value (eval next) slot-name))
               (prody (multf infy valy)))
            (cond
              ((eql prody prodx) (setq cnt (1+ cnt)))
              (t (setq loop-stop 1)))))
    (subseq ftsl 0 cnt)))

(defun top-contrib ()
  ;; tc[k] ends up holding the candidates that appear in more than k areas
  (preponderance)
  (setq pcnt (floor (* (car prepond) 0.10)))
  (setq all-c (mapcar #'(lambda (x) (top-portion x pcnt)) dist-slots))
  (setq candidates (sort
                    (remove-duplicates (flatten all-c))
                    #'string-lessp))
  (setq withdups (flatten all-c))
  (setq candcount (mapcar #'(lambda (x)
                              (count x withdups)) candidates))
  (setq lpos (1- (length candidates)))
  (setq tc (make-list 7 :initial-element candidates))
  (loop for pos from 0 to lpos
        do
        (setq tmpval (nth pos candidates))
        (case (nth pos candcount)
          ((7) t)
          ((6) (setf
                (subseq tc 6 7)
                (mapcar
                 #'(lambda (x) (remove tmpval x)) (subseq tc 6 7))))
          ((5) (setf
                (subseq tc 5 7)
                (mapcar
                 #'(lambda (x) (remove tmpval x)) (subseq tc 5 7))))
          ((4) (setf
                (subseq tc 4 7)
                (mapcar
                 #'(lambda (x) (remove tmpval x)) (subseq tc 4 7))))
          ((3) (setf
                (subseq tc 3 7)
                (mapcar
                 #'(lambda (x) (remove tmpval x)) (subseq tc 3 7))))
          ((2) (setf
                (subseq tc 2 7)
                (mapcar
                 #'(lambda (x) (remove tmpval x)) (subseq tc 2 7))))
          ((1) (setf
                (subseq tc 1 7)
                (mapcar
                 #'(lambda (x) (remove tmpval x)) (subseq tc 1 7))))
          ))
  t)

(defun calc-inf ()
  "Calculate the influence for each functional area"
  (preponderance)  ; generate list of # vuln present in each area
  (rawsum1)        ; generate raw weighted sums
  (setf area-avgs (mapcar #'(lambda (x) (divf x inf-sum)) areas))
  )


The  following  are  the  fuzzy  math  functions  used  to  implement  the  table  lookup 


;; Define Fuzzy math functions

(setf fuzzy-values (list 'vl 'l 'ml 'm 'mh 'h 'vh))
(setf vl "Very Low")
(setf l "Low")
(setf ml "Medium Low")
(setf m "Medium")
(setf mh "Medium High")
(setf h "High")
(setf vh "Very High")

(setf addf-array (append
                  (list (list 'VL 'L  'ML 'M  'MH 'H  'VH))
                  (list (list 'L  'ML 'M  'MH 'H  'VH 'VH))
                  (list (list 'ML 'M  'MH 'H  'VH 'VH 'VH))
                  (list (list 'M  'MH 'H  'VH 'VH 'VH 'VH))
                  (list (list 'MH 'H  'VH 'VH 'VH 'VH 'VH))
                  (list (list 'H  'VH 'VH 'VH 'VH 'VH 'VH))
                  (list (list 'VH 'VH 'VH 'VH 'VH 'VH 'VH))))

(setf multf-array (append
                   (list (list 'VL 'VL 'VL 'VL 'VL 'VL 'VL))
                   (list (list 'VL 'VL 'VL 'VL 'VL 'L  'L))
                   (list (list 'VL 'VL 'VL 'L  'L  'L  'ML))
                   (list (list 'VL 'VL 'L  'L  'ML 'ML 'M))
                   (list (list 'VL 'VL 'L  'ML 'ML 'M  'MH))
                   (list (list 'VL 'L  'L  'ML 'M  'MH 'H))
                   (list (list 'VL 'L  'ML 'M  'MH 'H  'VH))))

(setf divf-array (append
                  (list (list 'VH 'ML 'L  'VL 'VL 'VL 'VL))
                  (list (list 'VH 'VH 'MH 'ML 'ML 'L  'L))
                  (list (list 'VH 'VH 'VH 'MH 'M  'M  'ML))
                  (list (list 'VH 'VH 'VH 'VH 'H  'MH 'M))
                  (list (list 'VH 'VH 'VH 'VH 'VH 'H  'MH))
                  (list (list 'VH 'VH 'MH 'VH 'VH 'VH 'H))
                  (list (list 'VH 'VH 'VH 'VH 'VH 'VH 'VH))))

(setf subf-array (append
                  (list (list 'VL 'VL 'VL 'VL 'VL 'VL 'VL))
                  (list (list 'L  'VL 'VL 'VL 'VL 'VL 'VL))
                  (list (list 'ML 'L  'VL 'VL 'VL 'VL 'VL))
                  (list (list 'M  'ML 'L  'VL 'VL 'VL 'VL))
                  (list (list 'MH 'M  'ML 'L  'VL 'VL 'VL))
                  (list (list 'H  'MH 'M  'L  'L  'VL 'VL))
                  (list (list 'VH 'H  'MH 'M  'M  'L  'VL))))

(defun addf (value1 value2)
  "Will return the result of 'adding' the two fuzzy values"
  ;; row = position of value1, column = position of value2
  (setf pos1 (position value1 fuzzy-values))
  (setf pos2 (position value2 fuzzy-values))
  (nth pos2 (nth pos1 addf-array)))

(defun subf (value1 value2)
  "Will return the result of 'subtracting' the two fuzzy values"
  (setf pos1 (position value1 fuzzy-values))
  (setf pos2 (position value2 fuzzy-values))
  (nth pos2 (nth pos1 subf-array)))

(defun multf (value1 value2)
  "Will return the result of 'multiplying' the two fuzzy values"
  (setf pos1 (position value1 fuzzy-values))
  (setf pos2 (position value2 fuzzy-values))
  (nth pos2 (nth pos1 multf-array)))

(defun divf (value1 value2)
  "Will return the result of 'dividing' the two fuzzy values"
  (setf pos1 (position value1 fuzzy-values))
  (setf pos2 (position value2 fuzzy-values))
  (nth pos2 (nth pos1 divf-array)))
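
The table lookup can be tried in isolation with the following self-contained sketch. It copies the multf table into a quoted list and uses the illustrative names *terms*, *multf-table*, and lookup-multf, which are not part of the thesis code; the lookup logic mirrors multf above.

```lisp
;; Linguistic terms ordered from Very Low to Very High.
(defparameter *terms* '(vl l ml m mh h vh))

;; Rows are indexed by the first operand, columns by the second
;; (values transcribed from multf-array).
(defparameter *multf-table*
  '((vl vl vl vl vl vl vl)
    (vl vl vl vl vl l  l)
    (vl vl vl l  l  l  ml)
    (vl vl l  l  ml ml m)
    (vl vl l  ml ml m  mh)
    (vl l  l  ml m  mh h)
    (vl l  ml m  mh h  vh)))

(defun lookup-multf (a b)
  "Fuzzy 'product' of terms A and B by table lookup."
  (nth (position b *terms*)
       (nth (position a *terms*) *multf-table*)))

;; High x Very High stays High; Medium x Medium drops to Low.
(format t "~A ~A~%" (lookup-multf 'h 'vh) (lookup-multf 'm 'm))
```

Each operation costs two list searches and two indexed accesses regardless of the operands, which is why the fuzzy method scales with the number of vulnerabilities rather than with the arithmetic involved.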
Appendix B:  Sample Program Output


B-1  Program Output - Numeric Method (50 Samples)


Results  of  Vulnerability  Processing 

Total  number  of  Vulnerabilities  Processed:  50 

Statistical  Results  by  Functional  Area 


                 Audit    Recovery  Access   Media    O/S      Config   Docs
Influence        7.2378   7.3191    17.2539  6.0062   12.5051  12.2245  15.8863
Average          0.1448   0.1464    0.3451   0.1201   0.2501   0.2445   0.3177
Std Deviation    0.1268   0.1530    0.2478   0.1601   0.2169   0.1757   0.2192
Influence-Pct    9.23%    9.33%     22.00%   7.66%    15.94%   15.59%   20.25%

Significant Contributors to each Functional Area

Audit            V100  V125  V131  V143  V148
Recovery         V112  V110  V125  V131  V148
Access           V125  V142  V106  V101  V113
Media            V112  V106  V110  V125  V111
Operating Sys.   V125  V101  V115  V100  V144
Configuration    V143  V131  V137  V142  V125
Documentation    V127  V125  V115  V142  V131

Vulnerabilities Contributing to more than one Functional Area

Two Areas        V100 V101 V106 V110 V112 V115 V125 V131 V142 V143 V148
Three Areas      V125 V131 V142
Four Areas       V125 V131
Five Areas       V125
Six Areas        V125
Seven Areas      V125
Vulnerabilities Rankings for Each Functional Area

Functional Area: AUDIT
Sigma Group: 1 ==> V100 V125 V131
Sigma Group: 2 ==> V143 V148 V115 V101 V142 V112
Sigma Group: 3 ==> V137 V141 V117 V140 V144 V129 V135 V133 V118 V124 V134 V122
Sigma Group: 4 ==> V139 V128 V113 V136 V130 V132 V127 V114 V111 V105 V108 V107
                   V106 V119 V110 V147 V138 V120 V145 V103 V146 V116 V102 V149
                   V109 V104 V123 V126

Functional Area: RECOVER
Sigma Group: 2 ==> V112
Sigma Group: 3 ==> V110 V125 V131 V148 V100 V111
Sigma Group: 4 ==> V102 V101 V103 V117 V142 V140 V144 V115 V105 V143 V135 V137
Sigma Group: 5 ==> V124 V141 V104 V133 V113 V118 V139 V122 V132 V129 V130 V107
                   V128 V136 V138 V121 V127 V123 V145 V114 V108 V146 V119 V134
                   V116 V149 V120 V147 V126 V109

Functional Area: ACCESS
Sigma Group: 1 ==> V125 V142 V106 V101 V113 V112 V143
Sigma Group: 2 ==> V115 V100 V131 V137 V148 V144 V141 V128 V124
Sigma Group: 3 ==> V117 V132 V135 V114 V110 V139 V129 V133 V119 V140 V118 V122
                   V111 V105 V102 V136 V108 V103 V130
Sigma Group: 4 ==> V138 V134 V116 V104 V146 V149 V107 V109 V145 V127 V147 V120
                   V121 V123

Functional Area: MEDIA
Sigma Group: 1 ==> V112
Sigma Group: 2 ==> V106
Sigma Group: 3 ==> V110 V125
Sigma Group: 4 ==> V111 V102 V108 V109 V119 V105 V135 V148 V107 V131 V140 V146
                   V100
Sigma Group: 5 ==> V117 V115 V104 V13 V101 V137 V143 V122 V133 V138 V142 V129
                   V127 V130 V114 V144 V121 V124 V139 V128 V134 V141 V132 V120
                   V118 V126 V145 V123 V136 V147 V103 V149

Functional Area: OS
Sigma  Group:  1  ==> 

V125 

VlOl 

Sigma  Group:  2  ==> 

V124 

V131 

Sigma  Group:  3  ==> 

V117 

V137 

V135 

V129 

Sigma  Group:  4  ==> 

VllO 

V113 

V138 

V149 

V145 

V116 

Functional  Area:  CONFIGURATION 

Sigma  Group:  1  ==> 

V143 

V131 

Sigma  Group:  2  ==> 

VlOl 

V141 

Sigma  Group:  3  ==> 

V135 

V140 

V118 

V139 

V108 

V138 

Sigma  Group:  4  ==> 

VI 04 

V127 

VI 2  6 

Functional  Area:  DOCUMENTATION 

Sigma  Group:  1  ==> 

V127 

V125 

Sigma  Group:  2  ==> 

VlOl 

V106 

Sigma  Group:  3  ==> 

VI 12 

V114 

V124 

V133 

V108 

V130 

VllO 

Sigma  Group:  4  ==> 

V138 

V103 

V105 

V146 

V115 

VlOO 

V144 

V143 

V127 

V148 

V142 

V122 

V112 

V141 

V128 

V140 

V133 

V132 

V109 

V106 

V118 

V139 

V119 

Vlll 

V130 

V136 

V114 

V108 

V107 

V120 

V146 

V103 

V105 

V147 

V102 

V104 

V134 

V126 

V123 

V137 

V142 

V125 

V148 

V128 

V115 

V113 

V106 

V112 

V124 

VlOO 

V117 

VI 14 

V133 

V129 

V132 

V144 

V122 

V119 

Vlll 

VllO 

V103 

V130 

V136 

V146 

V105 

V102 

V109 

V107 

V149 

V120 

V145 

V147 

V116 

V134 

V121 

V115 

V142 

V131 

V143 

V137 

V148 

V113 

V128 

V141 

VlOO 

V144 

V109 

V117 

V118 

V132 

V122 

V135 

V119 

V129 

Vlll 

V140 

V139 

V116 

V149 

V107 

V120 

V134 

V145 

V104 

V147 

V136 

V121 

V102 

V123 
List of Vulnerabilities

V100 : The system does not have audit trails.
V101 : During logon, the system does not tell the user the date and
       time the ID was last used
V102 : Off-site backup does not exist for critical files
V103 : Off-site storage is not a secure area
V104 : Procedures have not been designed to insure only authorized
       personnel have access to the storage media
V105 : Storage media reserved for use by a contractor is not clear of
       all classified or sensitive data
V106 : External tape or disk identification labels do not include
       security classification
V107 : Media records are not kept.
V108 : Inventory records do not show media on hand in the library
V109 : Guidelines and controls have not been established to designate
       an individual as disk manager for each computer system
V110 : Backup is not available for power (generator or batteries)
V111 : Backup is not available for air conditioning
V112 : Backups of software for critical applications are not compared
       to working copies to detect unauthorized changes.
V113 : Security test and evaluation is not performed prior to
       certification
V114 : A contingency plan does not exist.
V115 : System documentation does not include detailed information
       concerning software use for the user
V116 : Security documentation does not include configuration management
       controls
V117 : For PC (single-user) systems, files of different classifications
       are not limited to authorized users
V118 : Classification of software does not take into account algorithms
       or processes that may be used
V119 : Classification of media is not downgraded by reviewing all
       information
V120 : For periods processing the system does not use separate copies
       of the operating system
V121 : Protection of ADP magnetic storage media does not include
       safeguarding media according to the highest classification ever
       recorded
V122 : The operating system does not automatically label all
       human-readable output with its sensitivity




V123 : Main memory and storage devices are not cleared before being
       assigned to another individual or process.

V124  :  The  operating  system  does  not  require  users  to  identify 
themselves  before  performing  any  actions 

V125  :  The  operating  system  does  not  use  a  protected  mechanism  (e.g., 
passwords)  to  authenticate  user  identity 

V126 : Testing is not performed to insure that there are no ways for an
       unauthorized user to gain access to the system

V127 : Documentation does not exist that describes operating system
       protection mechanisms

V128  :  Documentation  does  not  exist  that  outlines  the  test  plan, 
procedures  and  results  for  security  testing 

V129 : A trusted facility manual does not include procedures for the
       operator to operate the facility in a secure manner

V130  :  Passwords  are  not  randomly  generated 

V131  :  The  password  administrator's  responsibilities  do  not  include 
sole  access  to  the  password  file 

V132 : Requirement(s) not met are that private data passwords are known
       only by the creator

V133 : Password management and control does not consist of a single
       point of contact

V134  :  Audit  trails  for  password  distribution  and  change  are  not  in 
existence . 

V135  :  Passwords  are  not  changed  at  least  every  three  months 

V136 : Compromised or mishandled passwords are not changed at least
       within one working day

V137  :  Personal  passwords  are  not  deleted  within  three  work  days  when  a 
user  leaves  the  organization 

V138 : Group passwords are not changed within three work days when a
       user leaves the organization.

V139  :  Resource  protection  measures  do  not  include  making  personnel 
responsible  for  protection  of  government  property 

V140  :  The  keys  and  combinations  of  the  room  are  not  restricted  to  a 
limited  number  of  holders 

V141  :  The  keys  and  combinations  of  the  room  are  not  changed  on  a 
regular  basis 

V142  :  The  doors  and  gates  of  the  room  are  not  kept  closed  at  all  times 

V143  :  The  windows  of  the  room  are  not  kept  closed  at  all  times 

V144  :  Systems  are  not  located  such  that  access  is  controlled 

V145  :  Controls  for  small  computer  users  do  not  include  cold-booting  at 
the  start  of  each  session  if  classified 


V146  :  Diskettes  are  not  write-protected  when  it  is  appropriate  to  do 
so 

V147  :  Users  do  not  know  they  should  not  use  personally  owned  computers 
or  systems  at  home  for  Air  Force  business 

V148  :  Changes  to  software  are  not  documented 

V149  :  Access  to  utility  software  is  not  limited  to  specifically 
identified  personnel. 
B-2  Program  Output  -  Fuzzy  Method  (50  Samples)


Results  of  Vulnerability  Processing 

Total  number  of  Vulnerabilities  Processed:  50


Statistical Results by Functional Area

                 VH    H   MH    M   ML    L   VL
Audit             0    0    0    2    5   14   29
Recovery          1    0    0    1    6   12   30
Access            3    4    6    4    5   18   10
Media             1    0    1    1    2   11   34
Operating Sys.    1    2    3    6    4   12   22
Configuration     0    0    4    6    7   14   19
Documentation     1    4    5    5    8   18    9
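
In the rankings that follow, each vulnerability appears under exactly one importance level per functional area, so every row of the table above should sum to the 50 vulnerabilities processed. The Python sketch below is an illustration only and is not part of the thesis program. The names LEVELS, COUNTS, row_totals and level_totals are hypothetical, and the counts are transcribed from the table.

```python
# Illustrative consistency check; not part of the thesis program.
# Counts are transcribed from "Statistical Results by Functional Area".
LEVELS = ["VH", "H", "MH", "M", "ML", "L", "VL"]
TOTAL_PROCESSED = 50

COUNTS = {
    "Audit":          [0, 0, 0, 2, 5, 14, 29],
    "Recovery":       [1, 0, 0, 1, 6, 12, 30],
    "Access":         [3, 4, 6, 4, 5, 18, 10],
    "Media":          [1, 0, 1, 1, 2, 11, 34],
    "Operating Sys.": [1, 2, 3, 6, 4, 12, 22],
    "Configuration":  [0, 0, 4, 6, 7, 14, 19],
    "Documentation":  [1, 4, 5, 5, 8, 18, 9],
}


def row_totals(counts):
    """Sum the seven importance-level counts for each functional area."""
    return {area: sum(row) for area, row in counts.items()}


def level_totals(counts):
    """Sum each importance level across all functional areas."""
    return {
        level: sum(row[i] for row in counts.values())
        for i, level in enumerate(LEVELS)
    }


if __name__ == "__main__":
    for area, total in row_totals(COUNTS).items():
        status = "ok" if total == TOTAL_PROCESSED else "MISMATCH"
        print(f"{area:<15} {total:>3} {status}")
```

A row that does not total 50 would point to a transcription error in the table rather than to a property of the method.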

Significant Contributors to each Functional Area

Audit           F125 F100 F101 F112 F131 F143 F148
Recovery        F112 F110 F101 F125 F100 F131 F148 F111
Access          F101 F125 F142 F112 F106 F113 F143
Media           F112 F106 F110 F125 F111
Operating Sys.  F101 F125 F115 F100 F124 F144
Configuration   F142 F131 F143 F137 F101 F112 F125 F115 F148 F128
Documentation   F127 F125 F142 F115 F143

Vulnerabilities Contributing to more than one Functional Area

Two Areas       F100 F101 F106 F110 F111 F112 F115 F125 F131 F142 F143 F148
Three Areas     F100 F148 F101 F112 F115 F125 F131 F142 F143
Four Areas      F101 F112 F125 F143
Five Areas      F101 F112 F125
Six Areas       F125
Seven Areas     F125
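
The lists above are consistent with reading "N Areas" as "a significant contributor in at least N functional areas"; F125, for example, appears under every heading. That reading is inferred from the output shown here and is not stated by the program. The Python sketch below illustrates it under that assumption. SIGNIFICANT and contributors_in_at_least are hypothetical names, and the identifiers are transcribed from the Significant Contributors listing.

```python
from collections import Counter

# Illustrative only; not part of the thesis program.
# Identifiers are transcribed from "Significant Contributors to each
# Functional Area".
SIGNIFICANT = {
    "Audit": ["F125", "F100", "F101", "F112", "F131", "F143", "F148"],
    "Recovery": ["F112", "F110", "F101", "F125", "F100", "F131", "F148",
                 "F111"],
    "Access": ["F101", "F125", "F142", "F112", "F106", "F113", "F143"],
    "Media": ["F112", "F106", "F110", "F125", "F111"],
    "Operating Sys.": ["F101", "F125", "F115", "F100", "F124", "F144"],
    "Configuration": ["F142", "F131", "F143", "F137", "F101", "F112",
                      "F125", "F115", "F148", "F128"],
    "Documentation": ["F127", "F125", "F142", "F115", "F143"],
}


def contributors_in_at_least(significant, n):
    """Return the sorted IDs that are significant in at least n areas.

    The "at least n" reading is an assumption inferred from the output.
    """
    # set() guards against an ID being listed twice within one area.
    area_counts = Counter(
        vuln for ids in significant.values() for vuln in set(ids)
    )
    return sorted(v for v, count in area_counts.items() if count >= n)


if __name__ == "__main__":
    for n in range(2, 8):
        print(n, " ".join(contributors_in_at_least(SIGNIFICANT, n)))
```

The sketch returns identifiers in sorted order, whereas the program output lists them in its own order, so the two should be compared as sets.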
Vulnerabilities  Rankings  for  Each  Functional  Area


Functional Area: AUDIT

Importance: M  ==> F125 F100
Importance: ML ==> F101 F112 F131 F143 F148
Importance: L  ==> F122 F142 F115 F124 F144 F137 F141 F117 F133 F135
                   F118 F129 F134 F140
Importance: VL ==> F127 F106 F113 F108 F128 F132 F110 F136 F139 F105
                   F111 F114 F119 F102 F109 F130 F107 F103 F138 F146
                   F104 F116 F120 F147 F123 F126 F145 F149 F121

Functional Area: RECOVER

Importance: VH ==> F112
Importance: M  ==> F110
Importance: ML ==> F101 F125 F100 F131 F148 F111
Importance: L  ==> F142 F115 F124 F143 F144 F117 F135 F102 F105 F103
                   F104 F140
Importance: VL ==> F122 F127 F106 F113 F137 F141 F132 F133 F108 F128
                   F118 F129 F139 F109 F114 F119 F136 F107 F130 F138
                   F146 F116 F134 F121 F123 F126 F145 F149 F120 F147

Functional Area: ACCESS

Importance: VH ==> F101 F125 F142
Importance: H  ==> F112 F106 F113 F143
Importance: MH ==> F115 F131 F144 F137 F141 F148
Importance: M  ==> F100 F124 F117 F128
Importance: ML ==> F132 F135 F133 F118 F119
Importance: L  ==> F122 F136 F134 F110 F102 F146 F108 F130 F114 F138
                   F129 F140 F139 F103 F105 F104 F111 F116
Importance: VL ==> F127 F147 F109 F126 F107 F121 F123 F145 F149 F120
Functional Area: MEDIA

Importance: VH ==> F112
Importance: MH ==> F106
Importance: M  ==> F110
Importance: ML ==> F125 F111
Importance: L  ==> F100 F131 F108 F117 F135 F109 F102 F105 F119 F146
                   F107
Importance: VL ==> F101 F122 F127 F142 F113 F115 F124 F143 F144 F148
                   F137 F141 F128 F132 F133 F114 F118 F129 F136 F139
                   F104 F140 F130 F138 F103 F116 F134 F121 F123 F126
                   F120 F145 F147 F149

Functional Area: OS

Importance: VH ==> F101
Importance: H  ==> F125 F115
Importance: MH ==> F100 F124 F144
Importance: M  ==> F122 F127 F142 F131 F143 F148
Importance: ML ==> F112 F137 F141 F117
Importance: L  ==> F106 F113 F128 F132 F133 F135 F109 F118 F129 F119
                   F139 F140
Importance: VL ==> F108 F110 F105 F111 F136 F102 F114 F130 F107 F138
                   F146 F103 F104 F116 F134 F120 F149 F126 F147 F123
                   F145 F121

Functional Area: CONFIGURATION

Importance: MH ==> F142 F131 F143 F137

Importance : 
List  of  Vulnerabilities
F113

F115

F116 

F117 

F118 

F119 

F120 

F121 

F122 


:  The  system  does  not  have  audit  trails.

:  During  logon,  the  system  does  not  tell  the  user  the  date  and
time  the  ID  was  last  used

:  Off-site  backup  does  not  exist  for  critical  files

:  Off-site  storage  is  not  a  secure  area

:  Procedures  have  not  been  designed  to  insure  only  authorized
personnel  have  access  to  the  storage  media

:  Storage  media  reserved  for  use  by  a  contractor  is  not  clear  of
all  classified  or  sensitive  data

:  External  tape  or  disk  identification  labels  do  not  include
security  classification

:  Media  records  are  not  kept.

:  Inventory  records  do  not  show  media  on  hand  in  the  library

:  Guidelines  and  controls  have  not  been  established  to  designate
an  individual  as  disk  manager  for  each  computer  system

:  Backup  is  not  available  for  power  (generator  or  batteries)

:  Backup  is  not  available  for  air  conditioning

:  Backups  of  software  for  critical  applications  are  not  compared
to  working  copies  to  detect  unauthorized  changes.

:  Security  test  and  evaluation  is  not  performed  prior  to
certification

:  A  contingency  plan  does  not  exist.

:  System  documentation  does  not  include  detailed  information
concerning  software  use  for  the  user

:  Security  documentation  does  not  include  configuration  management
controls

:  For  PC  (single-user)  systems,  files  of  different  classifications
are  not  limited  to  authorized  users

:  Classification  of  software  does  not  take  into  account  algorithms
or  processes  that  may  be  used

:  Classification  of  media  is  not  downgraded  by  reviewing  all
information

:  For  periods  processing  the  system  does  not  use  separate  copies
of  the  operating  system

:  Protection  of  ADP  magnetic  storage  media  does  not  include
safeguarding  media  according  to  the  highest  classification  ever
recorded

:  The  operating  system  does  not  automatically  label  all
human-readable  output  with  its  sensitivity
:  Main  memory  and  storage  devices  are  not  cleared  before  being
assigned  to  another  individual  or  process.

:  The  operating  system  does  not  require  users  to  identify 
themselves  before  performing  any  actions 

:  The  operating  system  does  not  use  a  protected  mechanism  (e.g.,
passwords)  to  authenticate  user  identity

:  Testing  is  not  performed  to  insure  that  there  are  no  ways  for  an 
unauthorized  user  to  gain  access  to  the  system 

:  Documentation  does  not  exist  that  describes  operating  system 
protection  mechanisms 

:  Documentation  does  not  exist  that  outlines  the  test  plan, 
procedures  and  results  for  security  testing 

:  A  trusted  facility  manual  does  not  include  procedures  for  the 
operator  to  operate  the  facility  in  a  secure  manner 

:  Passwords  are  not  randomly  generated 

:  The  password  administrator's  responsibilities  do  not  include 
sole  access  to  the  password  file 

:  Requirement(s)  not  met  are  that  private  data  passwords  are  known
only  by  the  creator

:  Password  management  and  control  does  not  consist  of  a  single
point  of  contact

:  Audit  trails  for  password  distribution  and  change  are  not  in
existence.

:  Passwords  are  not  changed  at  least  every  three  months

:  Compromised  or  mishandled  passwords  are  not  changed  at  least
within  one  working  day

:  Personal  passwords  are  not  deleted  within  three  work  days  when  a
user  leaves  the  organization

:  Group  passwords  are  not  changed  within  three  work  days  when  a
user  leaves  the  organization.

:  Resource  protection  measures  do  not  include  making  personnel 
responsible  for  protection  of  government  property 

:  The  keys  and  combinations  of  the  room  are  not  restricted  to  a 
limited  number  of  holders 

:  The  keys  and  combinations  of  the  room  are  not  changed  on  a 
regular  basis 

:  The  doors  and  gates  of  the  room  are  not  kept  closed  at  all  times 

:  The  windows  of  the  room  are  not  kept  closed  at  all  times 

:  Systems  are  not  located  such  that  access  is  controlled 

:  Controls  for  small  computer  users  do  not  include  cold-booting  at 
the  start  of  each  session  if  classified 
F146  :  Diskettes  are  not  write-protected  when  it  is  appropriate  to  do
so

F147  :  Users  do  not  know  they  should  not  use  personally  owned  computers
or  systems  at  home  for  Air  Force  business

F148  :  Changes  to  software  are  not  documented

F149  :  Access  to  utility  software  is  not  limited  to  specifically
identified  personnel.